Abstract. It has long been known, since the classical work of (Arora, Karger, Karpinski, JCSS 99),
that Max-CUT admits a PTAS on dense graphs, and more generally, Max-$k$-CSP admits a PTAS
on “dense” instances with $\Omega(n^k)$ constraints. In this paper we extend and generalize their exhaustive
sampling approach, presenting a framework for $(1-\epsilon)$-approximating any Max-$k$-CSP problem in
sub-exponential time while significantly relaxing the denseness requirement on the input instance.
Specifically, we prove that for any constants $\delta \in (0,1]$ and $\epsilon > 0$, we can approximate Max-$k$-CSP
problems with $\Omega(n^{k-1+\delta})$ constraints within a factor of $(1-\epsilon)$ in time $2^{O(n^{1-\delta} \ln n/\epsilon^3)}$. The framework
is quite general and includes classical optimization problems, such as Max-CUT, Max-DICUT,
Max-$k$-SAT, and (with a slight extension) $k$-Densest Subgraph, as special cases. For Max-CUT in particular
(where $k = 2$), it gives an approximation scheme that runs in time sub-exponential in $n$ even for
“almost-sparse” instances (graphs with $n^{1+\delta}$ edges).

We prove that our results are essentially best possible, assuming the ETH. First, the density requirement
cannot be relaxed further: there exists a constant $r < 1$ such that for all $\delta > 0$, Max-$k$-SAT instances
with $O(n^{k-1})$ clauses cannot be approximated within a ratio better than $r$ in time $2^{O(n^{1-\delta})}$. Second,
the running time of our algorithm is almost tight for all densities. Even for Max-CUT there exists
$r < 1$ such that for all $\delta' > \delta > 0$, Max-CUT instances with $O(n^{1+\delta})$ edges cannot be approximated within
a ratio better than $r$ in time $2^{n^{1-\delta'}}$.


1 Introduction 

The complexity of Constraint Satisfaction Problems (CSPs) has long played a central role in
theoretical computer science and it quickly became evident that almost all interesting CSPs are
NP-complete [29]. Thus, since approximation algorithms are one of the standard tools for dealing with
NP-hard problems, the question of approximating the corresponding optimization problems
(Max-CSP) has attracted significant interest over the years [30]. Unfortunately, most CSPs typically resist
this approach: not only are they APX-hard [24], but quite often the best polynomial-time
approximation ratio we can hope to achieve for them is that guaranteed by a trivial random assignment
[22]. This striking behavior is often called approximation resistance.

Approximation resistance and other APX-hardness results were originally formulated in the
context of polynomial-time approximation. It would therefore seem that one conceivable way of
working around such barriers could be to consider approximation algorithms running in
super-polynomial time, and indeed super-polynomial approximation for NP-hard problems is a topic
that has been gaining more attention in the literature recently |11ISI7I12I1.STT3] . Unfortunately, the 
existence of quasi-linear PCPs with small soundness error, first given in the work of Moshkovitz
and Raz [25], established that approximation resistance is a phenomenon that carries over even
to sub-exponential time approximation, essentially “killing” this approach for CSPs. For instance,
we now know that if, for any $\epsilon > 0$, there exists an algorithm for Max-3-SAT with ratio $7/8 + \epsilon$
running in time $2^{n^{1-\epsilon}}$, this would imply the existence of a sub-exponential exact algorithm for
3-SAT, disproving the Exponential Time Hypothesis (ETH). It therefore seems that sub-exponential
time does not improve the approximability of CSPs, or put another way, for many CSPs obtaining
a very good approximation ratio requires almost as much time as solving the problem exactly.

Despite this grim overall picture, many positive approximation results for CSPs have appeared
over the years, by taking advantage of the special structure of various classes of instances. One
notable line of research in this vein is the work on the approximability of dense CSPs, initiated
by Arora, Karger and Karpinski [4] and independently by de la Vega [15]. The theme of this set
of results is that the problem of maximizing the number of satisfied constraints in a CSP instance
with arity $k$ (Max-$k$-CSP) becomes significantly easier if the instance contains $\Omega(n^k)$ constraints.
More precisely, it was shown in [4] that Max-$k$-CSP admits a polynomial-time approximation
scheme (PTAS) on dense instances, that is, an algorithm which for any constant $\epsilon > 0$ can in
time polynomial in $n$ produce an assignment that satisfies $(1-\epsilon)\mathrm{OPT}$ constraints. Subsequent
work produced a stream of positive [17, 5, 2, 10, 9, 21, 3, 20, 23] (and some negative mm) results on
approximating CSPs which are in general APX-hard, showing that dense instances form an island 
of tractability where many optimization problems which are normally APX-hard admit a PTAS. 

Our contribution: The main goal of this paper is to use the additional power afforded by
sub-exponential time to extend this island of tractability as much as possible. To demonstrate the main
result, consider a concrete CSP such as Max-3-SAT. As mentioned, we know that sub-exponential
time does not in general help us approximate this problem: the best ratio achievable in, say, $2^{\sqrt{n}}$
time is still 7/8. On the other hand, this problem admits a PTAS on instances with $\Omega(n^3)$ clauses.
This density condition is, however, rather strict, so the question we would like to answer is the
following: Can we efficiently approximate a larger (and more sparse) class of instances while using
sub-exponential time?

In this paper we provide a positive answer to this question, not just for Max-3-SAT, but also for
any Max-$k$-CSP problem. Specifically, we show that for any constants $\delta \in (0,1]$, $\epsilon > 0$ and integer
$k \ge 2$, there is an algorithm which achieves a $(1-\epsilon)$ approximation of Max-$k$-CSP instances with
$\Omega(n^{k-1+\delta})$ constraints in time $2^{O(n^{1-\delta} \ln n/\epsilon^3)}$. A notable special case of this result is for $k = 2$, where
the input instance can be described as a graph. For this case, which contains classical problems
such as Max-CUT, our algorithm gives an approximation scheme running in time $2^{O(n \ln n/(\Delta \epsilon^3))}$ for
graphs with average degree $\Delta$. In other words, this is an approximation scheme that runs in time
sub-exponential in $n$ even for almost sparse instances where the average degree is $\Delta = n^\delta$ for some
small $\delta > 0$. More generally, our algorithm provides a trade-off between the time available and the
density of the instances we can handle. For graph problems ($k = 2$) this trade-off covers the whole
spectrum from dense to almost sparse instances, while for general Max-$k$-CSP, it covers instances
where the number of constraints ranges from $\Theta(n^k)$ to $\Theta(n^{k-1})$.

Techniques: The algorithms in this paper are an extension and generalization of the exhaustive
sampling technique given by Arora, Karger and Karpinski [4], who introduced a framework of
smooth polynomial integer programs to give a PTAS for dense Max-$k$-CSP. The basic idea of that
work can most simply be summarized for Max-CUT. This problem can be recast as the problem
of maximizing a quadratic function over $n$ boolean variables. This is of course a hard problem, but
suppose that we could somehow “guess” for each vertex how many of its neighbors belong in each
side of the cut. This would make the quadratic problem linear, and thus much easier. The main
intuition now is that, if the graph is dense, we can take a sample of $O(\log n)$ vertices and guess their
partition in the optimal solution. Because every non-sample vertex will have “many” neighbors in
this sample, we can with high confidence say that we can estimate the fraction of neighbors on each
side for all vertices. The work of de la Vega [15] uses exactly this algorithm for Max-CUT, greedily
deciding the vertices outside the sample. The work of [4] on the other hand pushed this idea to its
logical conclusion, showing that it can be applied to degree-$k$ polynomial optimization problems,
by recursively turning them into linear programs whose coefficients are estimated from the sample.
The linear programs are then relaxed to produce fractional solutions, which can be rounded back
into an integer solution to the original problem.

On a very high level, the approach we follow in this paper retraces the steps of [4]: we formulate
Max-$k$-CSP as a degree-$k$ polynomial maximization problem; we then recursively decompose the
degree-$k$ polynomial problem into lower-degree polynomial optimization problems, estimating the
coefficients by using a sample of variables for which we try all assignments; the result of this process
is an integer linear program, for which we obtain a fractional solution in polynomial time; we then
perform randomized rounding to obtain an integer solution that we can use for the original problem.

The first major difference between our approach and [4] is of course that we need to use a larger
sample. This becomes evident if one considers Max-CUT on graphs with average degree $\Delta$. In
order to get the sampling scheme to work we must be able to guarantee that each vertex outside
the sample has “many” neighbors inside the sample, so we can safely estimate how many of them
end up on each side of the cut. For this, we need a sample of size at least $n \log n/\Delta$. Indeed, we
use a sample of roughly this size, and exhausting all assignments to the sample is what dominates
the running time of our algorithm. As we argue later, not only is the sample size we use essentially
tight, but more generally the running time of our algorithm is essentially optimal (under the ETH).

Nevertheless, using a larger sample is not in itself sufficient to extend the scheme of [4] to
non-dense instances. As observed in [4], “to achieve a multiplicative approximation for dense instances
it suffices to achieve an additive approximation for the nonlinear integer programming problem”.
In other words, one of the basic ingredients of the analysis of [4] is that additive approximation
errors of the order $\epsilon n^k$ can be swept under the rug, because we know that in a dense instance the
optimal solution has value $\Omega(n^k)$. This is not true in our case, and we are therefore forced to give
a more refined analysis of the error of our scheme, independently bounding the error introduced in
the first step (coefficient estimation) and the last (randomized rounding).

A further complication arises when considering Max-$k$-CSP for $k > 2$. The scheme of [4]
recursively decomposes such dense instances into lower-order polynomials which retain the same
“good” properties. This seems much harder to extend to the non-dense case, because intuitively if
we start from a non-dense instance the decomposition could end up producing some dense and some
sparse sub-problems. Indeed we present a scheme that approximates Max-$k$-CSP with $\Omega(n^{k-1+\delta})$
constraints, but does not seem to extend to instances with fewer than $n^{k-1}$ constraints. As we will
see, there seems to be a fundamental complexity-theoretic justification explaining exactly why this
decomposition method cannot be extended further.

To ease presentation, we first give all the details of our scheme for the special case of Max-CUT
in Section 3. We then present the full framework for approximating smooth polynomials in Section
4; this implies the approximation result for Max-$k$-SAT and more generally Max-$k$-CSP. We then
show in Section 5 that it is possible to extend our framework to handle $k$-Densest Subgraph, a
problem which can be expressed as the maximization of a polynomial subject to linear constraints.
For this problem we obtain an approximation scheme which, given a graph with average degree
$\Delta = n^\delta$, gives a $(1-\epsilon)$ approximation in time sub-exponential in $n$. Observe that this extends the result
of [4] for this problem not only in terms of the density of the input instance, but also in terms of
$k$ (the result of [4] required that $k = \Omega(n)$).

Hardness: What makes the results of this paper more interesting is that we can establish that in
many ways they are essentially best possible, if one assumes the ETH. In particular, there are at
least two ways in which one may try to improve on these results further: one would be to improve
the running time of our algorithm, while another would be to extend the algorithm to the range of
densities it cannot currently handle. In Section 6 we show that both of these approaches would face
significant barriers. Our starting point is the fact that (under ETH) it takes exponential time to
approximate Max-CUT arbitrarily well on sparse instances, which is a consequence of the existence
of quasi-linear PCPs. By manipulating such Max-CUT instances, we are able to show that for any
average degree $\Delta = n^\delta$ with $\delta < 1$ the time needed to approximate Max-CUT arbitrarily well
almost matches the performance of our algorithm. Furthermore, starting from sparse Max-CUT
instances, we can produce instances of Max-$k$-SAT with $O(n^{k-1})$ clauses while preserving hardness
of approximation. This gives a complexity-theoretic justification for our difficulties in decomposing
Max-$k$-CSP instances with fewer than $n^{k-1}$ constraints.

2 Notation and Preliminaries 

An $n$-variate degree-$d$ polynomial $p(x)$ is $\beta$-smooth [4], for some constant $\beta \ge 1$, if for every
$\ell \in \{0, \ldots, d\}$, the absolute value of each coefficient of each degree-$\ell$ monomial in the expansion
of $p(x)$ is at most $\beta n^{d-\ell}$. An $n$-variate degree-$d$ $\beta$-smooth polynomial $p(x)$ is $\delta$-bounded, for some
constant $\delta \in (0,1]$, if for every $\ell$, the sum, over all degree-$\ell$ monomials in $p(x)$, of the absolute
values of their coefficients is $O(\beta n^{d-1+\delta})$. Therefore, for any $n$-variate degree-$d$ $\beta$-smooth $\delta$-bounded
polynomial $p(x)$ and any $x \in \{0,1\}^n$, $|p(x)| = O(d \beta n^{d-1+\delta})$.

Throughout this work, we treat $\beta$, $\delta$ and $d$ as fixed constants and express the running time of
our algorithm as a function of $n$, i.e., the number of variables in $p(x)$.

Optimization Problem. Our approximation schemes for almost sparse instances of Max-CUT,
Max-$k$-SAT, and Max-$k$-CSP are obtained by reducing them to the following problem: Given an
$n$-variate degree-$d$ $\beta$-smooth $\delta$-bounded polynomial $p(x)$, we seek a binary vector $x^* \in \{0,1\}^n$ that
maximizes $p$, i.e., for all binary vectors $y \in \{0,1\}^n$, $p(x^*) \ge p(y)$.

Polynomial Decomposition and General Approach. As in [4, Lemma 3.1], our general
approach is motivated by the fact that any $n$-variate degree-$d$ $\beta$-smooth polynomial $p(x)$ can be
naturally decomposed into a collection of $n$ polynomials $p_j(x)$. Each of them has degree $d-1$ and
at most $n$ variables and is $\beta$-smooth.

Proposition 2.1 ([4]). Let $p(x)$ be any $n$-variate degree-$d$ $\beta$-smooth polynomial. Then, there exist
a constant $c$ and degree-$(d-1)$ $\beta$-smooth polynomials $p_j(x)$ such that $p(x) = c + \sum_{j=1}^{n} x_j p_j(x)$.

Proof. The proposition is shown in [4, Lemma 3.1]. We prove it here just for completeness. Each
polynomial $p_j(x)$ is obtained from $p(x)$ if we keep only the monomials with variable $x_j$ and pull $x_j$
out, as a common factor. The constant $c$ takes care of the constant term in $p(x)$. Each monomial
of degree $\ell$ in $p(x)$ becomes a monomial of degree $\ell - 1$ in $p_j(x)$, which implies that the degree of
$p_j(x)$ is $d-1$. Moreover, by the $\beta$-smoothness condition, the coefficient $t$ of each degree-$\ell$ monomial
in $p(x)$ has $|t| \le \beta n^{d-\ell}$. The corresponding monomial in $p_j(x)$ has degree $\ell - 1$ and the same
coefficient $t$ with $|t| \le \beta n^{(d-1)-(\ell-1)}$. Therefore, if $p(x)$ is $\beta$-smooth, each $p_j(x)$ is also $\beta$-smooth. □
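
The decomposition of Proposition 2.1 can be illustrated by a minimal Python sketch. The dictionary representation of a multilinear polynomial and the helper names `decompose` and `evaluate` are assumptions made here for illustration; they do not come from [4]. To count each monomial exactly once, the sketch charges it to its smallest-index variable, which is one of several valid choices.

```python
from itertools import product

# Assumed representation: {tuple of sorted variable indices: coefficient},
# where the empty tuple () holds the constant term.

def decompose(poly):
    """Return (c, parts) with poly(x) = c + sum_j x_j * parts[j](x).

    Each monomial is charged to its smallest-index variable, so it is
    counted exactly once and every parts[j] has degree one lower.
    """
    c = poly.get((), 0)
    parts = {}
    for mono, coef in poly.items():
        if not mono:
            continue  # constant term already stored in c
        j, rest = mono[0], mono[1:]
        part = parts.setdefault(j, {})
        part[rest] = part.get(rest, 0) + coef
    return c, parts

def evaluate(poly, x):
    """Evaluate poly at the 0/1 vector x."""
    total = 0
    for mono, coef in poly.items():
        term = coef
        for i in mono:
            term *= x[i]
        total += term
    return total

# Max-CUT polynomial of the path 0 - 1 - 2:
# x0 + 2*x1 + x2 - 2*x0*x1 - 2*x1*x2
p = {(0,): 1, (1,): 2, (2,): 1, (0, 1): -2, (1, 2): -2}
c, parts = decompose(p)
```

Here `parts[0]` is the degree-1 polynomial $1 - 2x_1$, and the identity $p(x) = c + \sum_j x_j p_j(x)$ can be checked on all eight binary vectors.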

Graph Optimization Problems. Let $G(V, E)$ be a (simple) graph with $n$ vertices and $m$ edges.
For each vertex $i \in V$, $N(i)$ denotes $i$'s neighborhood in $G$, i.e., $N(i) = \{j \in V : \{i, j\} \in E\}$. We
let $\deg(i) = |N(i)|$ be the degree of $i$ in $G$ and $\Delta = 2|E|/n$ denote the average degree of $G$. We
say that a graph $G$ is $\delta$-almost sparse, for some constant $\delta \in (0,1]$, if $m = \Omega(n^{1+\delta})$ (and thus,
$\Delta = \Omega(n^\delta)$).

In Max-CUT, we seek a partitioning of the vertices of $G$ into two sets $S_0$ and $S_1$ so that the
number of edges with endpoints in $S_0$ and $S_1$ is maximized. If $G$ has $m$ edges, the number of edges
in the optimal cut is at least $m/2$.

In $k$-Densest Subgraph, given an undirected graph $G(V, E)$, we seek a subset $C$ of $k$ vertices
so that the induced subgraph $G[C]$ has a maximum number of edges.

Constraint Satisfaction Problems. An instance of (boolean) Max-$k$-CSP with $n$ variables
consists of $m$ boolean constraints $f_1, \ldots, f_m$, where each $f_j : \{0,1\}^k \to \{0,1\}$ depends on $k$ variables
and is satisfiable, i.e., $f_j$ evaluates to 1 for some truth assignment. We seek a truth assignment to
the variables that maximizes the number of satisfied constraints. Max-$k$-SAT is a special case of
Max-$k$-CSP where each constraint $f_j$ is a disjunction of $k$ literals. An averaging argument implies
that the optimal assignment of a Max-$k$-CSP (resp. Max-$k$-SAT) instance with $m$ constraints
satisfies at least $2^{-k} m$ (resp. $(1 - 2^{-k}) m$) of them. We say that an instance of Max-$k$-CSP is
$\delta$-almost sparse, for some constant $\delta \in (0,1]$, if the number of constraints is $m = \Omega(n^{k-1+\delta})$.

Using standard arithmetization techniques (see e.g., [4, Sec. 4.3]), we can reduce any instance
of Max-$k$-CSP with $n$ variables to an $n$-variate degree-$k$ polynomial $p(x)$ so that the optimal truth
assignment for Max-$k$-CSP corresponds to a maximizer $x^* \in \{0,1\}^n$ of $p(x)$ and the value of the
optimal Max-$k$-CSP solution is equal to $p(x^*)$. Since each $k$-tuple of variables can appear in at
most $2^k$ different constraints, $p(x)$ is $\beta$-smooth, for $\beta \in [1, 4^k]$, and has at least $m$ and at most $4^k m$
monomials. Moreover, if the instance of Max-$k$-CSP has $m = \Theta(n^{k-1+\delta})$ constraints, then $p(x)$ is
$\delta$-bounded and its maximizer $x^*$ has $p(x^*) = \Omega(n^{k-1+\delta})$.
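
To make the arithmetization step concrete, here is a small Python sketch for the special case of a single disjunctive clause. It is an illustration under assumptions, not the construction of [4, Sec. 4.3]: the function names and the dictionary representation of polynomials are hypothetical. A clause is satisfied unless all of its literals are false, so its indicator is one minus the product of the "literal is false" indicators.

```python
from itertools import product

def clause_polynomial(literals):
    """Expand 1 - prod_i (literal i is false) into {sorted variable tuple: coef}.

    literals is a list of (variable index, is_positive) pairs over distinct
    variables. A positive literal x_v is false when (1 - x_v) = 1, and a
    negated literal is false when x_v = 1.
    """
    falsified = {(): 1}  # running product of the "literal is false" factors
    for v, positive in literals:
        factor = {(): 1, (v,): -1} if positive else {(v,): 1}
        nxt = {}
        for m1, c1 in falsified.items():
            for m2, c2 in factor.items():
                mono = tuple(sorted(m1 + m2))
                nxt[mono] = nxt.get(mono, 0) + c1 * c2
        falsified = nxt
    poly = {m: -c for m, c in falsified.items()}
    poly[()] = poly.get((), 0) + 1  # clause value = 1 - product
    return {m: c for m, c in poly.items() if c != 0}

def evaluate(poly, x):
    """Evaluate the polynomial at the 0/1 vector x."""
    total = 0
    for mono, coef in poly.items():
        term = coef
        for i in mono:
            term *= x[i]
        total += term
    return total

# Clause (x0 OR NOT x1 OR x2) becomes 1 - x1 + x0*x1 + x1*x2 - x0*x1*x2.
clause = clause_polynomial([(0, True), (1, False), (2, True)])
```

The polynomial of a whole Max-$k$-SAT instance is the sum of such clause polynomials, so its value at a 0/1 vector equals the number of satisfied clauses.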

Notation and Terminology. An algorithm has approximation ratio $\rho \in (0,1]$ (or is $\rho$-approximate)
if for all instances, the value of its solution is at least $\rho$ times the value of the optimal solution.

For graphs with $n$ vertices or CSPs with $n$ variables, we say that an event $E$ happens with high
probability (or whp.), if $E$ happens with probability at least $1 - 1/n^c$ for some constant $c \ge 1$.

For brevity and clarity, we sometimes write $\alpha \in (1 \pm \epsilon_1)\beta \pm \epsilon_2\gamma$, for some constants $\epsilon_1, \epsilon_2 > 0$,
to denote that $(1 - \epsilon_1)\beta - \epsilon_2\gamma \le \alpha \le (1 + \epsilon_1)\beta + \epsilon_2\gamma$.

3 Approximating Max-CUT in Almost Sparse Graphs 

In this section, we apply our approach to Max-CUT, which serves as a convenient example and 
allows us to present the intuition and the main ideas. 

The Max-CUT problem in a graph $G(V, E)$ is equivalent to maximizing, over all binary vectors
$x \in \{0,1\}^n$, the following $n$-variate degree-2 2-smooth polynomial

    $p(x) = \sum_{\{i,j\} \in E} (x_i(1 - x_j) + x_j(1 - x_i))$

Setting a variable $x_i$ to 0 indicates that the corresponding vertex $i$ is assigned to the left side of
the cut, i.e., to $S_0$, and setting $x_i$ to 1 indicates that vertex $i$ is assigned to the right side of the
cut, i.e., to $S_1$. We assume that $G$ is $\delta$-almost sparse and thus, has $m = \Omega(n^{1+\delta})$ edges and average
degree $\Delta = \Omega(n^\delta)$. Moreover, if $m = \Theta(n^{1+\delta})$, $p(x)$ is $\delta$-bounded, since for each edge $\{i,j\} \in E$,
the monomial $x_i x_j$ appears with coefficient $-2$ in the expansion of $p$, and for each vertex $i \in V$,
the monomial $x_i$ appears with coefficient $\deg(i)$ in the expansion of $p$. Therefore, for $\ell \in \{1, 2\}$, the
sum of the absolute values of the coefficients of all monomials of degree $\ell$ is at most $2m = O(n^{1+\delta})$.
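
The expansion just described can be checked on a small graph. The following Python sketch (function names are illustrative assumptions) compares the edge-by-edge polynomial with its expanded form $\sum_i \deg(i) x_i - 2 \sum_{\{i,j\} \in E} x_i x_j$ and with a direct count of cut edges.

```python
from itertools import product

def cut_polynomial(edges, x):
    """p(x) as the sum over edges of x_i(1 - x_j) + x_j(1 - x_i)."""
    return sum(x[i] * (1 - x[j]) + x[j] * (1 - x[i]) for i, j in edges)

def cut_size(edges, x):
    """Number of edges whose endpoints lie on different sides."""
    return sum(1 for i, j in edges if x[i] != x[j])

def degrees(n, edges):
    """Vertex degrees of a simple graph given as an edge list."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return deg

def expanded_value(n, edges, x):
    """Expanded form: coefficient deg(i) on x_i and -2 on x_i * x_j."""
    deg = degrees(n, edges)
    linear = sum(deg[i] * x[i] for i in range(n))
    quadratic = sum(-2 * x[i] * x[j] for i, j in edges)
    return linear + quadratic

def coefficient_sums(n, edges):
    """Sums of absolute coefficient values for degree 1 and degree 2."""
    return sum(degrees(n, edges)), 2 * len(edges)

# A 4-cycle with one chord: n = 4, m = 5.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
```

For this graph both coefficient sums equal $2m = 10$, matching the bound used to establish $\delta$-boundedness.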

Next, we extend and generalize the approach of [4] and show how to $(1-\epsilon)$-approximate the
optimal cut, for any constant $\epsilon > 0$, in time $2^{O(n \ln n/(\Delta \epsilon^3))}$ (see Theorem 3.1). The running time is
subexponential in $n$, if $G$ is $\delta$-almost sparse.

3.1 Outline and Main Ideas 

Applying Proposition 2.1, we can write the smooth polynomial $p(x)$ as

    $p(x) = \sum_{j \in V} x_j (\deg(j) - p_j(x))$,    (1)

where $p_j(x) = \sum_{i \in N(j)} x_i$ is a degree-1 1-smooth polynomial that indicates how many neighbors
of vertex $j$ are in $S_1$ in the solution corresponding to $x$. The key observation, due to [4], is that if
we have a good estimation $\rho_j$ of the value of each $p_j$ at the optimal solution $x^*$, then approximate
maximization of $p(x)$ can be reduced to the solution of the following Integer Linear Program:

    $\max \sum_{j \in V} y_j (\deg(j) - \rho_j)$    (IP)
    s.t. $(1 - \epsilon_1)\rho_j - \epsilon_2\Delta \le \sum_{i \in N(j)} y_i \le (1 + \epsilon_1)\rho_j + \epsilon_2\Delta$    $\forall j \in V$
         $y_j \in \{0,1\}$    $\forall j \in V$

The constants $\epsilon_1, \epsilon_2 > 0$ and the estimations $\rho_j \ge 0$ are computed so that the optimal solution $x^*$
is a feasible solution to (IP). We always assume wlog. that $0 \le \sum_{i \in N(j)} y_i \le \deg(j)$, i.e., we let the
lhs of the $j$-th constraint be $\max\{(1 - \epsilon_1)\rho_j - \epsilon_2\Delta, 0\}$ and the rhs be $\min\{(1 + \epsilon_1)\rho_j + \epsilon_2\Delta, \deg(j)\}$.
Clearly, if $x^*$ is a feasible solution to (IP), it remains a feasible solution after this modification. We
let (LP) denote the Linear Programming relaxation of (IP), where each $y_j \in [0,1]$.
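
As a hedged illustration of how the constraints of (IP) and (LP) are read, the following Python sketch checks feasibility of a vector $y$ against given estimations and evaluates the objective. The function names, the edge-list input and the numerical tolerance are assumptions made for this example; no LP solver is involved.

```python
def neighbours(n, edges):
    """Adjacency lists of a simple graph given as an edge list."""
    nbrs = [[] for _ in range(n)]
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    return nbrs

def constraint_bounds(rho_j, deg_j, eps1, eps2, avg_deg):
    """Clipped lhs and rhs of the j-th constraint."""
    lo = max((1 - eps1) * rho_j - eps2 * avg_deg, 0.0)
    hi = min((1 + eps1) * rho_j + eps2 * avg_deg, deg_j)
    return lo, hi

def is_feasible(n, edges, rho, y, eps1, eps2, tol=1e-9):
    """True if y (0/1 or fractional) satisfies every neighbourhood constraint."""
    nbrs = neighbours(n, edges)
    avg_deg = 2 * len(edges) / n
    for j in range(n):
        s = sum(y[i] for i in nbrs[j])
        lo, hi = constraint_bounds(rho[j], len(nbrs[j]), eps1, eps2, avg_deg)
        if s < lo - tol or s > hi + tol:
            return False
    return True

def objective(n, edges, rho, y):
    """Objective value: sum over j of y_j * (deg(j) - rho_j)."""
    nbrs = neighbours(n, edges)
    return sum(y[j] * (len(nbrs[j]) - rho[j]) for j in range(n))

# Triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
x = [1, 0, 1, 0]
nbrs = neighbours(4, edges)
exact_rho = [sum(x[i] for i in nbrs[j]) for j in range(4)]  # p_j(x) for each j
```

With exact estimations $\rho_j = p_j(x)$, the vector $x$ itself is feasible and the objective coincides with the cut value $p(x)$, which is 3 for the assignment above.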

The first important observation is that for any $\epsilon_1, \epsilon_2 > 0$, we can compute estimations $\rho_j$, by
exhaustive sampling, so that $x^*$ is a feasible solution to (IP) with high probability (see Lemma 3.1).
The second important observation is that the objective value of any feasible solution $y$ to (LP) is
close to $p(y)$ (see Lemma 3.2). Namely, for any feasible solution $y$, $\sum_{j \in V} y_j(\deg(j) - \rho_j) \approx p(y)$.

Based on these observations, the approximation algorithm performs the following steps:

1. We guess a sequence of estimations $\rho_1, \ldots, \rho_n$, by exhaustive sampling, so that $x^*$ is a feasible
solution to the resulting (IP) (see Section 3.2 for the details).

2. We formulate (IP) and find an optimal fractional solution $y^*$ to (LP).

3. We obtain an integral solution $z$ by applying randomized rounding to $y^*$ (and the method of
conditional probabilities, as in [28, 27]).

To see that this procedure indeed provides a good approximation to $p(x^*)$, we observe that:

    $p(z) \approx \sum_{j \in V} z_j(\deg(j) - \rho_j) \approx \sum_{j \in V} y^*_j(\deg(j) - \rho_j) \ge \sum_{j \in V} x^*_j(\deg(j) - \rho_j) \approx p(x^*)$    (2)

The first approximation holds because $z$ is an (almost) feasible solution to (IP) (see Lemma 3.3),
the second approximation holds because the objective value of $z$ is a good approximation to the
objective value of $y^*$, due to randomized rounding, the inequality holds because $x^*$ is a feasible
solution to (LP) and the final approximation holds because $x^*$ is a feasible solution to (IP).

In Sections 3.3 to 3.5, we make the notion of approximation precise so that $p(z) \ge (1 - \epsilon) p(x^*)$.
As for the running time, it is dominated by the time required for the exhaustive-sampling step. Since
we do not know $x^*$, we need to run the steps (2) and (3) above for every sequence of estimations
produced by exhaustive sampling. So, the outcome of the approximation scheme is the best of the
integral solutions $z$ produced in step (3) over all executions of the algorithm. In Section 3.2, we
show that a sample of size $O(n \ln n/\Delta)$ suffices for the computation of estimations $\rho_j$ so that $x^*$ is
a feasible solution to (IP) with high probability. If $G$ is $\delta$-almost sparse, the sample size is sublinear
in $n$ and the running time is subexponential in $n$.

3.2 Obtaining Estimations $\rho_j$ by Exhaustive Sampling

To obtain good estimations $\rho_j$ of the values $p_j(x^*) = \sum_{i \in N(j)} x^*_i$, i.e., of the number of $j$'s neighbors
in $S_1$ in the optimal cut, we take a random sample $R \subseteq V$ of size $O(n \ln n/\Delta)$ and try exhaustively
all possible assignments of the vertices in $R$ to $S_0$ and $S_1$. If $\Delta = \Omega(n^\delta)$, we have $2^{|R|} = 2^{O(n^{1-\delta} \ln n)}$
different assignments. For each assignment, described by a 0/1 vector $x$ restricted to $R$,
we compute an estimation $\rho_j = (n/|R|) \sum_{i \in N(j) \cap R} x_i$, for each vertex $j \in V$, and run the steps (2)
and (3) of the algorithm above. Since we try all possible assignments, one of them agrees with $x^*$ on
all vertices of $R$. So, for this assignment, the estimations computed are $\rho_j = (n/|R|) \sum_{i \in N(j) \cap R} x^*_i$.
The following shows that for these estimations, we have that $p_j(x^*) \approx \rho_j$ with high probability.

Lemma 3.1. Let $x$ be any binary vector. For all $\alpha_1, \alpha_2 > 0$, we let $\gamma = \Theta(1/(\alpha_1^2 \alpha_2))$ and let $R$
be a multiset of $r = \gamma n \ln n/\Delta$ vertices chosen uniformly at random with replacement from $V$. For
any vertex $j$, if $\rho_j = (n/r) \sum_{i \in N(j) \cap R} x_i$ and $\hat{\rho}_j = \sum_{i \in N(j)} x_i$, with probability at least $1 - 2/n^3$,

    $(1 - \alpha_1)\hat{\rho}_j - (1 - \alpha_1)\alpha_2\Delta \le \rho_j \le (1 + \alpha_1)\hat{\rho}_j + (1 + \alpha_1)\alpha_2\Delta$    (3)

Sketch of proof. If $\hat{\rho}_j = \Omega(\Delta)$, the neighbors of $j$ are well-represented in the random sample $R$ whp.,
because $|R| = \Theta(n \ln n/\Delta)$. Therefore, $|\hat{\rho}_j - \rho_j| \le \alpha_1 \hat{\rho}_j$ whp., by Chernoff bounds. If $\hat{\rho}_j = o(\Delta)$, the
lower bound in (3) becomes trivial, since it is non-positive, while $\rho_j \ge 0$. As for the upper bound,
we increase some $x_i$ to $x'_i \in [0,1]$, so that $\hat{\rho}'_j = \alpha_2\Delta$. Then, $\rho'_j \le (1 + \alpha_1)\hat{\rho}'_j = (1 + \alpha_1)\alpha_2\Delta$ whp.,
by the same Chernoff bound as above. Now the upper bound of (3) follows from $\rho_j \le \rho'_j$, which
holds for any instantiation of the random sample $R$. The formal proof follows from Lemma 4.1
with $\beta = 1$, $d = 2$ and $q = 0$, and with $\Delta$ instead of $n^\delta$. □

We note that $\rho_j \ge 0$ and always assume that $\rho_j \le \deg(j)$, since if $\rho_j$ satisfies (3), $\min\{\rho_j, \deg(j)\}$
also satisfies (3). For all $\epsilon_1, \epsilon_2 > 0$, setting $\alpha_1 = \epsilon_1/(1 + \epsilon_1)$ and $\alpha_2 = \epsilon_2$ in Lemma 3.1 and taking the
union bound over all vertices, we obtain that for $\gamma = \Theta(1/(\epsilon_1^2 \epsilon_2))$, with probability at least $1 - 2/n^2$,
the following holds for all vertices $j \in V$:

    $(1 - \epsilon_1)\rho_j - \epsilon_2\Delta \le \hat{\rho}_j \le (1 + \epsilon_1)\rho_j + \epsilon_2\Delta$    (4)

Therefore, with probability at least $1 - 2/n^2$, the optimal cut $x^*$ is a feasible solution to (IP) with
the estimations $\rho_j$ obtained by restricting $x^*$ to the vertices in $R$.
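
The estimator of this section can be sketched in a few lines of Python. The function names and the fixed random seed are assumptions for illustration. The sketch computes the estimations for one given assignment only; the full scheme repeats this for every assignment of the sampled vertices.

```python
import random

def random_sample(n, r, seed=0):
    """Multiset of r vertices drawn uniformly at random with replacement."""
    rng = random.Random(seed)
    return [rng.randrange(n) for _ in range(r)]

def estimate_from_sample(n, nbrs, x, sample):
    """rho_j = (n / r) * sum of x_i over sampled neighbours i of j.

    sample is a list, so a vertex drawn twice is counted twice, as in a
    multiset. Only the values x_i of sampled vertices are read.
    """
    r = len(sample)
    rho = []
    for j in range(n):
        hits = sum(x[i] for i in sample if i in nbrs[j])
        rho.append(n * hits / r)
    return rho

# Triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
nbrs = [{1, 2}, {0, 2}, {0, 1, 3}, {2}]
x = [1, 0, 1, 0]
```

When the sample contains every vertex exactly once, the scaling factor $n/r$ equals 1 and the estimations coincide with the exact neighbor counts, which gives a simple sanity check on the estimator.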


3.3 The Cut Value of Feasible Solutions 

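We next show that the objective value of any feasible solution $y$ to (LP) is close to $p(y)$. Therefore,
assuming that $x^*$ is feasible, any good approximation to (IP) is a good approximation to the optimal
cut.

Lemma 3.2. Let $\rho_1, \ldots, \rho_n$ be non-negative numbers and $y$ be any feasible solution to (LP). Then,

    $p(y) \in \sum_{j \in V} y_j(\deg(j) - \rho_j) \pm 2(\epsilon_1 + \epsilon_2) m$    (5)

Proof. Using (1) and the formulation of (LP), we obtain that:

    $p(y) = \sum_{j \in V} y_j \bigl(\deg(j) - \sum_{i \in N(j)} y_i\bigr)$
    $\in \sum_{j \in V} y_j \bigl(\deg(j) - ((1 \mp \epsilon_1)\rho_j \mp \epsilon_2\Delta)\bigr)$
    $= \sum_{j \in V} y_j(\deg(j) - \rho_j) \pm \epsilon_1 \sum_{j \in V} y_j \rho_j \pm \epsilon_2\Delta \sum_{j \in V} y_j$
    $\in \sum_{j \in V} y_j(\deg(j) - \rho_j) \pm 2(\epsilon_1 + \epsilon_2) m$

The first inclusion holds because $y$ is feasible for (LP) and thus, $\sum_{i \in N(j)} y_i \in (1 \pm \epsilon_1)\rho_j \pm \epsilon_2\Delta$, for
all $j$. The third inclusion holds because

    $\sum_{j \in V} y_j \rho_j \le \sum_{j \in V} \rho_j \le \sum_{j \in V} \deg(j) = 2m$,

since each $\rho_j$ is at most $\deg(j)$, and because $\Delta \sum_{j \in V} y_j \le \Delta n = 2m$. □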


3.4 Randomized Rounding of the Fractional Optimum 


As a last step, we show how to round the fractional optimum $y^* = (y^*_1, \ldots, y^*_n)$ of (LP) to an
integral solution $z = (z_1, \ldots, z_n)$ that almost satisfies the constraints of (IP).

To this end, we use randomized rounding, as in [28]. In particular, we set independently each
$z_j$ to 1, with probability $y^*_j$, and to 0, with probability $1 - y^*_j$. By Chernoff bounds^, we obtain that
with probability at least $1 - 2/n^8$, for each vertex $j$,

    $(1 - \epsilon_1)\rho_j - \epsilon_2\Delta - 2\sqrt{\deg(j) \ln n} \le \sum_{i \in N(j)} z_i \le (1 + \epsilon_1)\rho_j + \epsilon_2\Delta + 2\sqrt{\deg(j) \ln n}$    (6)

Specifically, the inequality above follows from the Chernoff bound stated in the footnote with $k = \deg(j)$
and $t = 2\sqrt{\deg(j) \ln n}$, since $E[\sum_{i \in N(j)} z_i] = \sum_{i \in N(j)} y^*_i \in (1 \pm \epsilon_1)\rho_j \pm \epsilon_2\Delta$. By the union bound,
(6) is satisfied with probability at least $1 - 2/n^7$ for all vertices $j$.

By linearity of expectation, $E[\sum_{j \in V} z_j(\deg(j) - \rho_j)] = \sum_{j \in V} y^*_j(\deg(j) - \rho_j)$. Moreover, since
the probability that $z$ does not satisfy (6) for some vertex $j$ is at most $2/n^7$ and since the objective
value of (IP) is at most $n^2$, the expected value of a rounded solution $z$ that satisfies (6) for all
vertices $j$ is at least $\sum_{j \in V} y^*_j(\deg(j) - \rho_j) - 1$ (assuming that $n \ge 2$). Using the method of conditional
expectations, as in [28, 27], we can find in (deterministic) polynomial time an integral solution $z$ that
satisfies (6) for all vertices $j$ and has $\sum_{j \in V} z_j(\deg(j) - \rho_j) \ge \sum_{j \in V} y^*_j(\deg(j) - \rho_j) - 1$. Next, we
sometimes abuse the notation and refer to such an integral solution $z$ (computed deterministically)
as the integral solution obtained from $y^*$ by randomized rounding.
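
The independent rounding step can be sketched as follows. This is a minimal Python illustration with assumed names and a fixed seed; it implements only the randomized step, not the derandomization by conditional expectations.

```python
import math
import random

def randomized_round(y, seed=0):
    """Set z_j = 1 with probability y_j, independently for each j."""
    rng = random.Random(seed)
    # rng.random() lies in [0, 1), so y_j = 1 always rounds up to 1
    # and y_j = 0 always rounds down to 0.
    return [1 if rng.random() < yj else 0 for yj in y]

def deviation_allowance(deg_j, n):
    """Extra slack 2 * sqrt(deg(j) * ln n) that appears in (6)."""
    return 2.0 * math.sqrt(deg_j * math.log(n))

def satisfies_rounded_constraint(z, nbrs_j, rho_j, eps1, eps2, avg_deg, n):
    """Check the two-sided bound (6) for one vertex with neighbour list nbrs_j."""
    s = sum(z[i] for i in nbrs_j)
    slack = eps2 * avg_deg + deviation_allowance(len(nbrs_j), n)
    return (1 - eps1) * rho_j - slack <= s <= (1 + eps1) * rho_j + slack
```

An integral $y$ is left unchanged by the rounding, so any loss in the objective comes only from the fractional coordinates of $y^*$.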

The following is similar to Lemma 3.2 and shows that the objective value $p(z)$ of the rounded
solution $z$ is close to the optimal value of (LP).

Lemma 3.3. Let $y^*$ be the optimal solution of (LP) and let $z$ be the integral solution obtained
from $y^*$ by randomized rounding (and the method of conditional expectations). Then,

    $p(z) \in \sum_{j \in V} y^*_j(\deg(j) - \rho_j) \pm 3(\epsilon_1 + \epsilon_2) m$    (7)

^ We use the following standard Chernoff bound (see e.g., [19, Theorem 1.1]): Let $Y_1, \ldots, Y_k$ be independent random
variables in $[0,1]$ and let $Y = \sum_{j=1}^{k} Y_j$. Then for all $t > 0$, $P[|Y - E[Y]| > t] \le 2\exp(-2t^2/k)$.






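Proof. Using (6) and an argument similar to that in the proof of Lemma 3.2, we obtain that:

    $p(z) = \sum_{j \in V} z_j \bigl(\deg(j) - \sum_{i \in N(j)} z_i\bigr)$
    $\in \sum_{j \in V} z_j \bigl(\deg(j) - ((1 \mp \epsilon_1)\rho_j \mp \epsilon_2\Delta \mp 2\sqrt{\deg(j) \ln n})\bigr)$
    $= \sum_{j \in V} z_j(\deg(j) - \rho_j) \pm \epsilon_1 \sum_{j \in V} z_j \rho_j \pm \epsilon_2\Delta \sum_{j \in V} z_j \pm 2\sum_{j \in V} z_j \sqrt{\deg(j) \ln n}$
    $\in \sum_{j \in V} z_j(\deg(j) - \rho_j) \pm (3\epsilon_1 + 2\epsilon_2) m$
    $\in \sum_{j \in V} y^*_j(\deg(j) - \rho_j) \pm 3(\epsilon_1 + \epsilon_2) m$

The first inclusion holds because $z$ satisfies (6) for all $j \in V$. For the third inclusion, we use that
$\sum_{j \in V} z_j \rho_j \le \sum_{j \in V} \deg(j) = 2m$, that $\Delta \sum_{j \in V} z_j \le \Delta n = 2m$, and that, by Jensen's inequality,

    $2\sum_{j \in V} z_j \sqrt{\deg(j) \ln n} \le \sum_{j \in V} \sqrt{4\deg(j) \ln n} \le \sqrt{8 m n \ln n} \le \epsilon_1 m$,

assuming that $n$ and $m = \Omega(n^{1+\delta})$ are sufficiently large. For the last inclusion, we recall that
$\sum_{j \in V} z_j(\deg(j) - \rho_j) \ge \sum_{j \in V} y^*_j(\deg(j) - \rho_j) - 1$ and assume that $m$ is sufficiently large. □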
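
For readers who want the Jensen step of the proof of Lemma 3.3 spelled out, the chain of inequalities can be expanded as follows. This is a supplementary derivation; the explicit threshold on $m$ in the last line is derived here by squaring both sides and is not stated in this form in the text above.

```latex
\begin{align*}
2\sum_{j \in V} z_j \sqrt{\deg(j)\ln n}
  &\le \sum_{j \in V} \sqrt{4\deg(j)\ln n}
   && \text{since } z_j \in \{0,1\} \\
  &\le n \sqrt{\frac{1}{n}\sum_{j \in V} 4\deg(j)\ln n}
   && \text{Jensen's inequality, concavity of } \sqrt{\cdot} \\
  &= \sqrt{8\,m\,n \ln n}
   && \text{since } \sum_{j \in V}\deg(j) = 2m \\
  &\le \epsilon_1 m
   && \text{whenever } m \ge 8\,n \ln n / \epsilon_1^2 .
\end{align*}
```

The last condition holds for all sufficiently large $n$ when $m = \Omega(n^{1+\delta})$, since $n^{1+\delta}$ grows faster than $n \ln n$ for any constant $\delta > 0$.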

3.5 Putting Everything Together 

Therefore, for any $\epsilon > 0$, if $G$ is $\delta$-almost sparse and $\Delta = n^\delta$, the algorithm described in Section 3.1,
with sample size $\Theta(n \ln n/(\epsilon^3 \Delta))$, computes estimations $\rho_j$ such that the optimal cut $x^*$ is a feasible
solution to (IP) whp. Hence, by the analysis above, the algorithm approximates the value of the
optimal cut $p(x^*)$ within an additive term of $O(\epsilon m)$. Specifically, setting $\epsilon_1 = \epsilon_2 = \epsilon/16$, the value
of the cut $z$ produced by the algorithm satisfies the following with probability at least $1 - 2/n^2$:

    $p(z) \ge \sum_{j \in V} y^*_j(\deg(j) - \rho_j) - 3\epsilon m/8 \ge \sum_{j \in V} x^*_j(\deg(j) - \rho_j) - 3\epsilon m/8 \ge p(x^*) - \epsilon m/2 \ge (1 - \epsilon) p(x^*)$

The first inequality follows from Lemma 3.3, the second inequality holds because $y^*$ is the optimal
solution to (LP) and $x^*$ is feasible for (LP), the third inequality follows from Lemma 3.2 and the
fourth inequality holds because the optimal cut has at least $m/2$ edges.

Theorem 3.1. Let $G(V, E)$ be a $\delta$-almost sparse graph with $n$ vertices. Then, for any $\epsilon > 0$, we
can compute, in time $2^{O(n^{1-\delta} \ln n/\epsilon^3)}$ and with probability at least $1 - 2/n^2$, a cut $z$ of $G$ with value
$p(z) \ge (1 - \epsilon) p(x^*)$, where $x^*$ is the optimal cut.

4 Approximate Maximization of Smooth Polynomials 

Generalizing the ideas applied to Max-CUT, we arrive at the main algorithmic result of the paper:
an algorithm to approximately optimize $\beta$-smooth $\delta$-bounded polynomials $p(x)$ of degree $d$ over all
binary vectors $x \in \{0,1\}^n$. The intuition and the main ideas are quite similar to those in Section 3,
but the details are significantly more involved because we are forced to recursively decompose
degree-$d$ polynomials to eventually obtain a linear program. In what follows, we take care of the
technical details.

Next, we significantly generalize the ideas applied to Max-CUT so that we approximately
optimize $\beta$-smooth $\delta$-bounded polynomials $p(x)$ of degree $d$ over all binary vectors $x \in \{0,1\}^n$. The
structure of this section deliberately parallels the structure of Section 3, so that the application to
Max-CUT can always serve as a reference for the intuition behind the generalization.

As in [4] (and as explained in Section 2), we exploit the fact that any $n$-variate degree-$d$
$\beta$-smooth polynomial $p(x)$ can be decomposed into $n$ degree-$(d-1)$ $\beta$-smooth polynomials $p_j(x)$
such that $p(x) = c + \sum_{j \in N} x_j p_j(x)$ (Proposition 2.1). For smooth polynomials of degree $d \ge 3$, we
apply Proposition 2.1 recursively until we end up with smooth polynomials of degree 1. Specifically,
using Proposition 2.1, we further decompose each degree-$(d-1)$ $\beta$-smooth polynomial $p_{i_1}(x)$ into
$n$ degree-$(d-2)$ $\beta$-smooth polynomials $p_{i_1 j}(x)$ such that $p_{i_1}(x) = c_{i_1} + \sum_{j \in N} x_j p_{i_1 j}(x)$, etc. At the
basis of the recursion, at depth $d-1$, we have $\beta$-smooth polynomials $p_{i_1 \ldots i_{d-1}}(x)$ of degree 1, one
for each $(d-1)$-tuple of indices $(i_1, \ldots, i_{d-1}) \in N^{d-1}$. These polynomials are written as

    $p_{i_1 \ldots i_{d-1}}(x) = c_{i_1 \ldots i_{d-1}} + \sum_{j \in N} x_j c_{i_1 \ldots i_{d-1} j}$,

where $c_{i_1 \ldots i_{d-1} j}$ are constants (these are the coefficients of the corresponding degree-$d$ monomials
in the expansion of $p(x)$). Due to $\beta$-smoothness, $|c_{i_1 \ldots i_{d-1} j}| \le \beta$ and $|c_{i_1 \ldots i_{d-1}}| \le \beta n$. Inductively,
$\beta$-smoothness implies that each polynomial $p_{i_1 \ldots i_{d-\ell}}(x)$ of degree $\ell \ge 1$ in this decomposition^ has
$|p_{i_1 \ldots i_{d-\ell}}(x)| \le (\ell + 1)\beta n^{\ell}$ for all binary vectors $x \in \{0,1\}^n$. Such a decomposition of $p(x)$ in

/3-smooth polynomials of degree d — 1, d — 2,..., 1 can be computed recursively in time 
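The recursive decomposition can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that a multilinear polynomial is stored as a dictionary from sorted index tuples to coefficients, and that each monomial is assigned to its smallest index (one of several valid ways to split it); the function names are hypothetical.

```python
def decompose(poly):
    """Split poly (dict: sorted index tuple -> coeff) as c + sum_j x_j * p_j(x).

    Each monomial is charged to its smallest index j, so the split is unique.
    """
    const = poly.get((), 0)
    parts = {}
    for mono, coeff in poly.items():
        if mono:
            j, rest = mono[0], mono[1:]
            parts.setdefault(j, {})[rest] = coeff
    return const, parts


def evaluate(poly, x):
    """Evaluate the polynomial directly, monomial by monomial."""
    total = 0
    for mono, coeff in poly.items():
        term = coeff
        for i in mono:
            term *= x[i]
        total += term
    return total


def evaluate_recursive(poly, x):
    """Evaluate through the recursive decomposition, down to constants."""
    const, parts = decompose(poly)
    return const + sum(x[j] * evaluate_recursive(pj, x) for j, pj in parts.items())
```

Both evaluation routes agree on every binary vector, which is exactly the identity $p(x) = c + \sum_j x_j p_j(x)$ applied level by level.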


4.1 Outline and General Approach 

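As in Section 3 (and as in [4]), we observe that if we have good estimations $\rho_{i_1 \ldots i_{d-\ell}}$ of the values of each degree-$\ell$ polynomial $p_{i_1 \ldots i_{d-\ell}}(x)$ at the optimal solution $x^*$, for each level $\ell = 1, \ldots, d-1$ of the decomposition, then approximate maximization of $p(x)$ can be reduced to the solution of the following Integer Linear Program (where $a = b \pm e$ abbreviates $b - e \le a \le b + e$):

\[
\begin{array}{lll}
\max & \sum_{j \in N} y_j \rho_j & \\[4pt]
\text{s.t.} & c_{i_1} + \sum_{j \in N} y_j \rho_{i_1 j} = \rho_{i_1} \pm \varepsilon_1 \bar\rho_{i_1} \pm \varepsilon_2 n^{d-2+\delta} & \forall i_1 \in N \\[4pt]
 & c_{i_1 i_2} + \sum_{j \in N} y_j \rho_{i_1 i_2 j} = \rho_{i_1 i_2} \pm \varepsilon_1 \bar\rho_{i_1 i_2} \pm \varepsilon_2 n^{d-3+\delta} & \forall (i_1, i_2) \in N \times N \\[4pt]
 & \cdots & \\[4pt]
 & c_{i_1 \ldots i_{d-\ell}} + \sum_{j \in N} y_j \rho_{i_1 \ldots i_{d-\ell} j} = \rho_{i_1 \ldots i_{d-\ell}} \pm \varepsilon_1 \bar\rho_{i_1 \ldots i_{d-\ell}} \pm \varepsilon_2 n^{\ell-1+\delta} & \forall (i_1, \ldots, i_{d-\ell}) \in N^{d-\ell} \\[4pt]
 & \cdots & \\[4pt]
 & c_{i_1 \ldots i_{d-1}} + \sum_{j \in N} y_j c_{i_1 \ldots i_{d-1} j} = \rho_{i_1 \ldots i_{d-1}} \pm \varepsilon_1 \bar\rho_{i_1 \ldots i_{d-1}} \pm \varepsilon_2 n^{\delta} & \forall (i_1, \ldots, i_{d-1}) \in N^{d-1} \\[4pt]
 & y_j \in \{0, 1\} & \forall j \in N
\end{array}
\tag{d-IP}
\]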

⁴ This decomposition can be performed in a unique way if we insist that $i_1 < i_2 < \cdots < i_{d-1}$, but this is not important for our analysis.



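Algorithm 1 Recursive estimation procedure Estimate($p_{i_1 \ldots i_{d-\ell}}(x)$, $\ell$, $R$, $s$)

Input: $n$-variate degree-$\ell$ polynomial $p_{i_1 \ldots i_{d-\ell}}(x)$, $R \subseteq N$ and a value $s_j \in \{0,1\}$ for each $j \in R$
Output: Estimation $\rho_{i_1 \ldots i_{d-\ell}}$ of $p_{i_1 \ldots i_{d-\ell}}(\bar{s})$, where $\bar{s}_R = s$

if $\ell = 0$ then return $c_{i_1 \ldots i_d}$    /* $p_{i_1 \ldots i_d}(x)$ is equal to the constant $c_{i_1 \ldots i_d}$ */
compute decomposition $p_{i_1 \ldots i_{d-\ell}}(x) = c_{i_1 \ldots i_{d-\ell}} + \sum_{j \in N} x_j \, p_{i_1 \ldots i_{d-\ell} j}(x)$
for all $j \in N$ do
    $\rho_{i_1 \ldots i_{d-\ell} j} \leftarrow$ Estimate($p_{i_1 \ldots i_{d-\ell} j}(x)$, $\ell - 1$, $R$, $s$)
$\rho_{i_1 \ldots i_{d-\ell}} \leftarrow c_{i_1 \ldots i_{d-\ell}} + \frac{|N|}{|R|} \sum_{j \in R} s_j \, \rho_{i_1 \ldots i_{d-\ell} j}$
return $\rho_{i_1 \ldots i_{d-\ell}}$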


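In (d-IP), we also use absolute value estimations $\bar\rho_{i_1 \ldots i_{d-\ell}}$. For each level $\ell \ge 1$ of the decomposition of $p(x)$ and each tuple $(i_1, \ldots, i_{d-\ell}) \in N^{d-\ell}$, we define the corresponding absolute value estimation as $\bar\rho_{i_1 \ldots i_{d-\ell}} = \sum_{j \in N} |\rho_{i_1 \ldots i_{d-\ell} j}|$. Namely, each absolute value estimation $\bar\rho_{i_1 \ldots i_{d-\ell}}$ at level $\ell$ is the sum of the absolute values of the estimations $\rho_{i_1 \ldots i_{d-\ell} j}$ at level $\ell - 1$. The reason that we use absolute value estimations and set the lhs/rhs of the constraints to $\rho_{i_1 \ldots i_{d-\ell}} \pm \varepsilon_1 \bar\rho_{i_1 \ldots i_{d-\ell}}$, instead of simply to $(1 \pm \varepsilon_1)\rho_{i_1 \ldots i_{d-\ell}}$, is that we want to consider linear combinations of positive and negative estimations $\rho_{i_1 \ldots i_{d-\ell} j}$ in a uniform way.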

Similarly to Section 3, the estimations $\rho_{i_1 \ldots i_{d-\ell}}$ (and $\bar\rho_{i_1 \ldots i_{d-\ell}}$) are computed (by exhaustive sampling) and the constants $\varepsilon_1, \varepsilon_2 > 0$ are calculated so that the optimal solution $x^*$ is a feasible solution to (d-IP). In the following, we let $\vec\rho$ denote the sequence of estimations $\rho_{i_1 \ldots i_{d-\ell}}$, for all levels $\ell$ and all tuples $(i_1, \ldots, i_{d-\ell}) \in N^{d-\ell}$, that we use to formulate (d-IP). The absolute value estimations $\bar\rho_{i_1 \ldots i_{d-\ell}}$ can be easily computed from $\vec\rho$. We let (d-LP) denote the Linear Programming relaxation of (d-IP), where each $y_j \in [0,1]$, let $x^*$ denote the binary vector that maximizes $p(x)$, and let $y^* \in [0,1]^n$ denote the fractional optimal solution of (d-LP).

As in Section 3, the approach is based on the facts that (i) for all constants $\varepsilon_1, \varepsilon_2 > 0$, we can compute estimations $\vec\rho$, by exhaustive sampling, so that $x^*$ is a feasible solution to (d-IP) with high probability (see Lemma 4.1 and Lemma 4.2); and that (ii) the objective value of any feasible solution $y$ to (d-LP) is close to $p(y)$ (see Lemma 4.3 and Lemma 4.4). Based on these observations, the general description of the approximation algorithm is essentially identical to the three steps described in Section 3, and the reasoning behind the approximation guarantee is that of (2).


4.2 Obtaining Estimations by Exhaustive Sampling


We first show how to use exhaustive sampling and obtain an estimation $\rho_{i_1 \ldots i_{d-\ell}}$ of the value at the optimal solution $x^*$ of each degree-$\ell$ polynomial $p_{i_1 \ldots i_{d-\ell}}(x)$ in the decomposition of $p(x)$.

As in Section 3.2, we take a sample $R$ from $N$, uniformly at random and with replacement. The sample size is $r = \Theta(n^{1-\delta} \ln n)$. We try exhaustively all 0/1 assignments to the variables in $R$, which can be performed in time $2^r = 2^{O(n^{1-\delta} \ln n)}$.
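As a rough illustration of the exhaustive step, the sketch below computes a sample size of the form $\gamma n^{1-\delta} \ln n$ and enumerates every 0/1 assignment to the sampled variables. The constant `gamma` and both function names are assumptions made for this example; repeated indices in the multiset necessarily receive the same value, so only distinct indices are enumerated.

```python
import math
from itertools import product


def sample_size(n, delta, gamma):
    """r = gamma * n^(1 - delta) * ln n, rounded up (gamma is a free constant here)."""
    return math.ceil(gamma * n ** (1 - delta) * math.log(n))


def assignments(sample):
    """Yield every 0/1 assignment to the distinct indices of the sample."""
    distinct = sorted(set(sample))
    for bits in product([0, 1], repeat=len(distinct)):
        yield dict(zip(distinct, bits))
```

The number of assignments is at most $2^r$, which is where the sub-exponential factor in the running time comes from.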

For each assignment, described by a 0/1 vector $s$ restricted to $R$, we compute the corresponding estimations recursively, as described in Algorithm 1. Specifically, for the basis level $\ell = 0$ and each $d$-tuple $(i_1, \ldots, i_d) \in N^d$ of indices, the corresponding estimation is the coefficient $c_{i_1 \ldots i_d}$ of the monomial $x_{i_1} \cdots x_{i_d}$ in the expansion of $p(x)$. For each level $\ell$, $1 \le \ell \le d-1$, and each $(d-\ell)$-tuple $(i_1, \ldots, i_{d-\ell}) \in N^{d-\ell}$, given the level-$(\ell-1)$ estimations $\rho_{i_1 \ldots i_{d-\ell} j}$ of $p_{i_1 \ldots i_{d-\ell} j}(\bar{s})$, for all $j \in N$, we compute the level-$\ell$ estimation $\rho_{i_1 \ldots i_{d-\ell}}$ of $p_{i_1 \ldots i_{d-\ell}}(\bar{s})$ from $s$ as follows:

\[ \rho_{i_1 \ldots i_{d-\ell}} = c_{i_1 \ldots i_{d-\ell}} + \frac{n}{r} \sum_{j \in R} s_j \, \rho_{i_1 \ldots i_{d-\ell} j} \tag{8} \]

In Algorithm 1, $\bar{s}$ is any vector in $\{0,1\}^n$ that agrees with $s$ on the variables of $R$. Given the estimations $\rho_{i_1 \ldots i_{d-\ell} j}$, for all $j \in N$, we can also compute the absolute value estimations $\bar\rho_{i_1 \ldots i_{d-\ell}}$ at level $\ell$. Due to the $\beta$-smoothness property of $p(x)$, we have that $|c_{i_1 \ldots i_{d-\ell}}| \le \beta n^{\ell}$, for all levels $\ell \ge 0$. Moreover, we assume that $0 \le \bar\rho_{i_1 \ldots i_{d-\ell}} \le \ell \beta n^{\ell}$ and $|\rho_{i_1 \ldots i_{d-\ell}}| \le (\ell+1)\beta n^{\ell}$, for all levels $\ell \ge 1$. This assumption is wlog. because, due to $\beta$-smoothness, any binary vector $x$ is feasible for (d-IP) with such values for the estimations and the absolute value estimations.
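The recursive estimation rule (8) can be sketched in a few lines of Python. This is an illustrative sketch, not the paper's implementation: it assumes polynomials are dictionaries from sorted index tuples to coefficients, splits each monomial on its smallest index, and only recurses into the sampled indices that actually enter the sum in (8). The name `estimate` and its argument layout are hypothetical.

```python
def estimate(poly, level, sample, s, n):
    """Estimate poly at any vector that agrees with s on the sampled indices.

    poly   : dict mapping sorted index tuples to coefficients (degree <= level)
    level  : level l of the decomposition
    sample : list of sampled indices R (a multiset, repeats allowed)
    s      : dict giving the 0/1 value s[j] of each sampled index j
    n      : number of variables |N|
    """
    const = poly.get((), 0)
    if level == 0:
        return const  # a level-0 polynomial is just its constant coefficient
    # Decompose as const + sum_j x_j * p_j(x), splitting on the smallest index.
    parts = {}
    for mono, coeff in poly.items():
        if mono:
            parts.setdefault(mono[0], {})[mono[1:]] = coeff
    total = 0
    for j in sample:
        if s[j] and j in parts:
            total += estimate(parts[j], level - 1, sample, s, n)
    # Rescale the sampled sum by n / r, as in rule (8).
    return const + n * total / len(sample)
```

When the sample happens to contain every index exactly once, the scaling factor is 1 and the estimate coincides with the true value of the polynomial at $s$; with a smaller sample, the sum over $R$ is scaled up by $n/r$.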

Remark 4.1. For simplicity, we state Algorithm 1 so that it computes, from $s$, an estimation $\rho_{i_1 \ldots i_{d-\ell}}$ of the value of a given degree-$\ell$ polynomial $p_{i_1 \ldots i_{d-\ell}}(x)$ at $\bar{s}$. So, we need to apply Algorithm 1 $O(n^{d-1})$ times, once for each polynomial that arises in the recursive decomposition, with the same sample $R$ and the same assignment $s$. We can easily modify Algorithm 1 so that a single call Estimate($p(x)$, $d$, $R$, $s$) computes the estimations of all the polynomials that arise in the recursive decomposition of $p(x)$. Thus, we save a factor of $d$ on the running time. The running time of the simple version is $O(d n^d)$, while the running time of the modified version is $O(n^d)$.


4.3 Sampling Lemma 

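We use the next lemma to show that if $s = x^*_R$, the estimations $\rho_{i_1 \ldots i_{d-\ell}}$ computed by Algorithm 1 are close to $c_{i_1 \ldots i_{d-\ell}} + \sum_{j \in N} x^*_j \rho_{i_1 \ldots i_{d-\ell} j}$ with high probability.

Lemma 4.1. Let $x$ be any binary vector and let $(\rho_j)_{j \in N}$ be any sequence such that for some integer $q \ge 0$ and some constant $\beta \ge 1$, $\rho_j \in [0, (q+1)\beta n^q]$, for all $j \in N$. For all integers $d \ge 1$ and for all $\alpha_1, \alpha_2 > 0$, we let $\gamma = \Theta(dq\beta/(\alpha_1^2 \alpha_2))$ and let $R$ be a multiset of $r = \gamma n^{1-\delta} \ln n$ indices chosen uniformly at random with replacement from $N$, where $\delta \in (0,1]$ is any constant. If $\hat\rho = (n/r) \sum_{j \in R} \rho_j x_j$ and $\rho = \sum_{j \in N} \rho_j x_j$, with probability at least $1 - 2/n^{d+1}$,

\[ (1 - \alpha_1)\rho - (1 - \alpha_1)\alpha_2 n^{q+\delta} \le \hat\rho \le (1 + \alpha_1)\rho + (1 + \alpha_1)\alpha_2 n^{q+\delta} \tag{9} \]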

Proof. To provide some intuition, we observe that if $\rho = \Omega(n^{q+\delta})$, we have $\Omega(n^{\delta})$ values $\rho_j = \Theta(n^q)$. These values are well-represented in the random sample $R$, with high probability, since the size of the sample is $\Theta(n^{1-\delta} \ln n)$. Therefore, $|\hat\rho - \rho| \le \alpha_1 \rho$, with high probability, by standard Chernoff bounds. If $\rho = o(n^{q+\delta})$, the lower bound in (9) becomes trivial, since it is non-positive, while $\hat\rho \ge 0$. As for the upper bound, we increase the coefficients $\rho_j$ to $\rho'_j \in [0, (q+1)\beta n^q]$, so that $\rho' = \alpha_2 n^{q+\delta}$. Then, $\hat\rho' \le (1 + \alpha_1)\rho' = (1 + \alpha_1)\alpha_2 n^{q+\delta}$, with high probability, by the same Chernoff bound as above. Now the upper bound of (9) follows from $\hat\rho \le \hat\rho'$, which holds for any instantiation of the random sample $R$.

We proceed to formalize the idea above. For simplicity of notation, we let $B = (q+1)\beta n^q$ and $\alpha'_2 = \alpha_2/((q+1)\beta)$ throughout the proof. For each sample $l$, $l = 1, \ldots, r$, we let $X_l$ be a random variable distributed in $[0,1]$. For each index $j$, if the $l$-th sample is $j$, $X_l$ becomes $\rho_j/B$, if $x_j = 1$, and becomes 0, otherwise. Therefore, $\mathbb{E}[X_l] = \rho/(Bn)$. We let $X = \sum_{l=1}^{r} X_l$. Namely, $X$ is the sum of $r$ independent random variables identically distributed in $[0,1]$. Using that $r = \gamma n^{1-\delta} \ln n$, we have that $\mathbb{E}[X] = \gamma \rho \ln n/(B n^{\delta})$ and that $\hat\rho = BnX/r = B n^{\delta} X/(\gamma \ln n)$.

We distinguish between the case where $\rho \ge \alpha'_2 B n^{\delta}$ and the case where $\rho < \alpha'_2 B n^{\delta}$. We start with the case where $\rho \ge \alpha'_2 B n^{\delta}$. Then, by the Chernoff bound⁵,

\[ \mathbb{P}\big[|X - \mathbb{E}[X]| > \alpha_1 \mathbb{E}[X]\big] \le 2\exp\!\left(-\frac{\alpha_1^2 \gamma \rho \ln n}{3 B n^{\delta}}\right) \le 2\exp\!\left(-\alpha_1^2 \alpha'_2 \gamma \ln n/3\right) \le 2/n^{d+1} . \]

For the second inequality, we use that $\rho \ge \alpha'_2 B n^{\delta}$. For the last inequality, we use that $\gamma \ge 3(d+1)/(\alpha_1^2 \alpha'_2) = 3(d+1)(q+1)\beta/(\alpha_1^2 \alpha_2)$, since $\alpha'_2 = \alpha_2/((q+1)\beta)$. Therefore, with probability at least $1 - 2/n^{d+1}$,

\[ (1 - \alpha_1)\frac{\gamma \rho \ln n}{B n^{\delta}} \le X \le (1 + \alpha_1)\frac{\gamma \rho \ln n}{B n^{\delta}} . \]

Multiplying everything by $Bn/r = B n^{\delta}/(\gamma \ln n)$, we have that with probability at least $1 - 2/n^{d+1}$, $(1 - \alpha_1)\rho \le \hat\rho \le (1 + \alpha_1)\rho$, which clearly implies (9).

We proceed to the case where $\rho < \alpha'_2 B n^{\delta}$. Then, $(1 - \alpha_1)\rho < (1 - \alpha_1)\alpha'_2 B n^{\delta} = (1 - \alpha_1)\alpha_2 n^{q+\delta}$. Therefore, since $\hat\rho \ge 0$, because $\rho_j \ge 0$, for all $j \in N$, the lower bound of (9) on $\hat\rho$ is trivial. For the upper bound, we show that with probability at least $1 - 1/n^{d+1}$, $\hat\rho \le (1 + \alpha_1)\alpha'_2 B n^{\delta} = (1 + \alpha_1)\alpha_2 n^{q+\delta}$. To this end, we consider a sequence $(\rho'_j)_{j \in N}$ so that $\rho_j \le \rho'_j \le (q+1)\beta n^q$, for all $j \in N$, and $\rho' = \sum_{j \in N} \rho'_j x_j = \alpha'_2 B n^{\delta}$. We can obtain such a sequence by increasing an appropriate subset of the $\rho_j$ up to $(q+1)\beta n^q$ (if $x$ does not contain enough 1's, we may also change some $x_j$ from 0 to 1). For the new sequence, we let $\hat\rho' = (n/r) \sum_{j \in R} \rho'_j x_j$ and observe that $\hat\rho \le \hat\rho'$, for any instantiation of the random sample $R$. Therefore,

\[ \mathbb{P}\big[\hat\rho > (1 + \alpha_1)\alpha_2 n^{q+\delta}\big] \le \mathbb{P}\big[\hat\rho' > (1 + \alpha_1)\rho'\big] , \]

where we use that $\rho' = \alpha'_2 B n^{\delta} = \alpha_2 n^{q+\delta}$. By the choice of $\rho'$, we can apply the same Chernoff bound as above and obtain that $\mathbb{P}[\hat\rho' > (1 + \alpha_1)\rho'] \le 1/n^{d+1}$. □
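The proof relies on the scaled sample sum being an unbiased estimator of the full sum (each $X_l$ has expectation $\rho/(Bn)$). A small sanity check of this fact, under the assumption of a tiny instance where all $n^r$ ordered samples can be enumerated, might look as follows; the function names are illustrative only.

```python
from fractions import Fraction
from itertools import product


def sampled_estimate(rho, x, sample):
    """rho_hat = (n / r) * sum of rho[j] * x[j] over the sampled indices j."""
    n, r = len(rho), len(sample)
    return Fraction(n, r) * sum(rho[j] * x[j] for j in sample)


def expected_estimate(rho, x, r):
    """Average rho_hat over all n**r equally likely samples drawn with replacement."""
    n = len(rho)
    samples = list(product(range(n), repeat=r))
    return sum(sampled_estimate(rho, x, smp) for smp in samples) / len(samples)
```

Exact rational arithmetic is used so that the identity between the expected estimate and $\sum_{j \in N} \rho_j x_j$ can be checked without rounding error. The concentration around this mean is what the Chernoff bound provides.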


Lemma 4.1 is enough for Max-CUT and graph optimization problems, where the estimations $\rho_{i_1 \ldots i_{d-\ell} j}$ are non-negative. For arbitrary smooth polynomials however, the estimations $\rho_{i_1 \ldots i_{d-\ell} j}$ may also be negative. So, we need a generalization of Lemma 4.1 that deals with both positive and negative estimations. To this end, given a sequence of estimations $(\rho_j)_{j \in N}$, with $\rho_j \in [-(q+1)\beta n^q, (q+1)\beta n^q]$, we let $\rho_j^+ = \max\{\rho_j, 0\}$ and $\rho_j^- = \min\{\rho_j, 0\}$, for all $j \in N$. Namely, $\rho_j^+$ (resp. $\rho_j^-$) is equal to $\rho_j$, if $\rho_j$ is positive (resp. negative), and 0, otherwise. Moreover, we let

\[ \hat\rho^+ = (n/r) \sum_{j \in R} \rho_j^+ x_j , \quad \rho^+ = \sum_{j \in N} \rho_j^+ x_j , \quad \hat\rho^- = (n/r) \sum_{j \in R} \rho_j^- x_j \quad \text{and} \quad \rho^- = \sum_{j \in N} \rho_j^- x_j . \]

Applying Lemma 4.1 once for positive estimations and once for negative estimations (with the absolute values of $\rho_j^-$, $\hat\rho^-$ and $\rho^-$, instead), we obtain that with probability at least $1 - 4/n^{d+1}$, the following inequalities hold:

\[ (1 - \alpha_1)\rho^+ - (1 - \alpha_1)\alpha_2 n^{q+\delta} \le \hat\rho^+ \le (1 + \alpha_1)\rho^+ + (1 + \alpha_1)\alpha_2 n^{q+\delta} \]

\[ (1 + \alpha_1)\rho^- - (1 + \alpha_1)\alpha_2 n^{q+\delta} \le \hat\rho^- \le (1 - \alpha_1)\rho^- + (1 - \alpha_1)\alpha_2 n^{q+\delta} \]

Using that $\hat\rho = \hat\rho^+ + \hat\rho^-$ and that $\rho = \rho^+ + \rho^-$, we obtain the following generalization of Lemma 4.1.


⁵ We use the following bound (see e.g., [19, Theorem 1.1]): Let $Y_1, \ldots, Y_k$ be independent random variables identically distributed in $[0,1]$ and let $Y = \sum_{j=1}^{k} Y_j$. Then for all $\epsilon \in (0,1)$, $\mathbb{P}[|Y - \mathbb{E}[Y]| > \epsilon \mathbb{E}[Y]] \le 2\exp(-\epsilon^2 \mathbb{E}[Y]/3)$.






Lemma 4.2 (Sampling Lemma). Let $x \in \{0,1\}^n$ and let $(\rho_j)_{j \in N}$ be any sequence such that for some integer $q \ge 0$ and some constant $\beta \ge 1$, $|\rho_j| \le (q+1)\beta n^q$, for all $j \in N$. For all integers $d \ge 1$ and for all $\alpha_1, \alpha_2 > 0$, we let $\gamma = \Theta(dq\beta/(\alpha_1^2 \alpha_2))$ and let $R$ be a multiset of $r = \gamma n^{1-\delta} \ln n$ indices chosen uniformly at random with replacement from $N$, where $\delta \in (0,1]$ is any constant. If $\hat\rho = (n/r) \sum_{j \in R} \rho_j x_j$, $\rho = \sum_{j \in N} \rho_j x_j$ and $\bar\rho = \sum_{j \in N} |\rho_j|$, with probability at least $1 - 4/n^{d+1}$,

\[ \rho - \alpha_1 \bar\rho - 2\alpha_2 n^{q+\delta} \le \hat\rho \le \rho + \alpha_1 \bar\rho + 2\alpha_2 n^{q+\delta} \tag{10} \]

For all constants $\varepsilon_1, \varepsilon_2 > 0$ and all constants $c$, we use Lemma 4.2 with $\alpha_1 = \varepsilon_1$ and $\alpha_2 = \varepsilon_2/2$ and obtain that for $\gamma = \Theta(dq\beta/(\varepsilon_1^2 \varepsilon_2))$, with probability at least $1 - 4/n^{d+1}$, the following holds for any binary vector $x$ and any sequence of estimations $(\rho_j)_{j \in N}$ produced by Algorithm 1 with $s = x_R$ (note that in Algorithm 1, the additive constant $c$ is included in the estimation $\hat\rho$ when its value is computed from the estimations $\rho_j$):

\[ c + \frac{n}{r} \sum_{j \in R} \rho_j x_j - \varepsilon_1 \sum_{j \in N} |\rho_j| - \varepsilon_2 n^{q+\delta} \le c + \sum_{j \in N} x_j \rho_j \le c + \frac{n}{r} \sum_{j \in R} \rho_j x_j + \varepsilon_1 \sum_{j \in N} |\rho_j| + \varepsilon_2 n^{q+\delta} \tag{11} \]

Now, let us consider (d-IP) with the estimations computed by Algorithm 1 with $s = x^*_R$ (i.e., with the optimal assignment for the variables in the random sample $R$). Then, using (11) and taking the union bound over all constraints, which are at most $2n^{d-1}$, we obtain that with probability at least $1 - 8/n^2$, the optimal solution $x^*$ is a feasible solution to (d-IP). So, from now on, we condition on the high probability event that $x^*$ is a feasible solution to (d-IP) and to (d-LP).

4.4 The Value of Feasible Solutions to (d-LP) 

From now on, we focus on estimations $\vec\rho$ produced by Estimate($p(x)$, $d$, $R$, $s$), where $R$ is a random sample from $N$ and $s = x^*_R$, and the corresponding programs (d-IP) and (d-LP). The analysis in Section 4.2 implies that $x^*$ is a feasible solution to (d-IP) (and to (d-LP)), with high probability.

We next show that for any feasible solution $y$ of (d-LP) and any polynomial $q(x)$ in the decomposition of $p(x)$, the value of $q(y)$ is close to the value of $c + \sum_{j} y_j \rho_j$ in the constraint of (d-LP) corresponding to $q$. Applying Lemma 4.3, we show below (see Lemma 4.4) that $p(y)$ is close to $c + \sum_{j \in N} y_j \rho_j$, i.e., to the objective value of $y$ in (d-LP) and (d-IP), for any feasible solution $y$.

To state and prove the following lemma, we introduce cumulative absolute value estimations $\tau_{i_1 \ldots i_{d-\ell}}$, defined recursively as follows: For level $\ell = 1$ and each tuple $(i_1, \ldots, i_{d-1}) \in N^{d-1}$, we let $\tau_{i_1 \ldots i_{d-1}} = \bar\rho_{i_1 \ldots i_{d-1}} = \sum_{j \in N} |c_{i_1 \ldots i_{d-1} j}|$. For each level $\ell \ge 2$ of the decomposition of $p(x)$ and each tuple $(i_1, \ldots, i_{d-\ell}) \in N^{d-\ell}$, we let $\tau_{i_1 \ldots i_{d-\ell}} = \bar\rho_{i_1 \ldots i_{d-\ell}} + \sum_{j \in N} \tau_{i_1 \ldots i_{d-\ell} j}$. Namely, each cumulative absolute value estimation $\tau_{i_1 \ldots i_{d-\ell}}$ is equal to the sum of all absolute value estimations that appear below the root of the decomposition tree of $p_{i_1 \ldots i_{d-\ell}}(x)$.
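To make the two recursive quantities concrete, here is a minimal sketch that computes absolute value estimations and cumulative absolute value estimations on a small tree. The tree encoding (a node is a pair of an estimation and a list of child nodes, with level-0 nodes having no children) is an assumption chosen for this example.

```python
def abs_estimation(node):
    """rho_bar: sum of |estimation| over the children, i.e. one level down."""
    _, children = node
    return sum(abs(child[0]) for child in children)


def cumulative_abs_estimation(node):
    """tau: equals rho_bar at level 1, and rho_bar plus the children's tau above."""
    _, children = node
    if not children:  # level 0: a constant coefficient, nothing below it
        return 0
    return abs_estimation(node) + sum(
        cumulative_abs_estimation(child) for child in children
    )
```

At level 1 the children are constants whose cumulative value is 0, so the two quantities coincide, as in the definition.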

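Lemma 4.3. Let $q(x)$ be any $\ell$-degree polynomial appearing in the decomposition of $p(x)$, let $q(x) = c + \sum_{j \in N} x_j q_j(x)$ be the decomposition of $q(x)$, let $\rho$ and $\{\rho_j\}_{j \in N}$ be the estimations of $q$ and $\{q_j\}_{j \in N}$ produced by Algorithm 1 and used in (d-LP), and let $\tau$ and $\{\tau_j\}_{j \in N}$ be the corresponding cumulative absolute value estimations. Then, for any feasible solution $y$ of (d-LP)

\[ \rho - \varepsilon_1 \tau - \ell \varepsilon_2 n^{\ell-1+\delta} \le q(y) \le \rho + \varepsilon_1 \tau + \ell \varepsilon_2 n^{\ell-1+\delta} \tag{12} \]

Proof. The proof is by induction on the degree $\ell$. The basis, for $\ell = 1$, is trivial, because in the decomposition of $q(x)$, each $q_j(x)$ is a constant $c_j$. Therefore, Algorithm 1 outputs $\rho_j = c_j$ and

\[ q(y) = c + \sum_{j \in N} y_j q_j(y) = c + \sum_{j \in N} y_j \rho_j \in \rho \pm \varepsilon_1 \bar\rho \pm \varepsilon_2 n^{\delta} , \]

where the inclusion follows from the feasibility of $y$ for (d-LP). We also use that at level $\ell = 1$, $\tau = \bar\rho$ (i.e., cumulative absolute value estimations and absolute value estimations are identical).

We inductively assume that (12) is true for all degree-$(\ell-1)$ polynomials $q_j(x)$ that appear in the decomposition of $q(x)$ and establish the lemma for $q(x) = c + \sum_{j \in N} x_j q_j(x)$. We have that:

\[
\begin{aligned}
q(y) = c + \sum_{j \in N} y_j q_j(y) &\in c + \sum_{j \in N} y_j \left( \rho_j \pm \varepsilon_1 \tau_j \pm (\ell-1)\varepsilon_2 n^{\ell-2+\delta} \right) \\
&= \Big( c + \sum_{j \in N} y_j \rho_j \Big) \pm \varepsilon_1 \sum_{j \in N} y_j \tau_j \pm (\ell-1)\varepsilon_2 n^{\ell-2+\delta} \sum_{j \in N} y_j \\
&\in \left( \rho \pm \varepsilon_1 \bar\rho \pm \varepsilon_2 n^{\ell-1+\delta} \right) \pm \varepsilon_1 \sum_{j \in N} \tau_j \pm (\ell-1)\varepsilon_2 n^{\ell-1+\delta} \\
&\in \rho \pm \varepsilon_1 \tau \pm \ell \varepsilon_2 n^{\ell-1+\delta}
\end{aligned}
\]

The first inclusion holds by the induction hypothesis. The second inclusion holds because (i) $y$ is a feasible solution to (d-LP) and thus, $c + \sum_{j \in N} y_j \rho_j$ satisfies the corresponding constraint; (ii) $\sum_{j \in N} y_j \tau_j \le \sum_{j \in N} \tau_j$; and (iii) $\sum_{j \in N} y_j \le n$. The last inclusion holds because $\tau = \bar\rho + \sum_{j \in N} \tau_j$, by the definition of cumulative absolute value estimations. □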

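Using Lemma 4.3 and the notion of cumulative absolute value estimations, we next show that $p(y)$ is close to $c + \sum_{j \in N} y_j \rho_j$, for any feasible solution $y$.

Lemma 4.4. Let $p(x) = c + \sum_{j \in N} x_j p_j(x)$ be the decomposition of $p(x)$, let $\{\rho_j\}_{j \in N}$ be the estimations of $\{p_j\}_{j \in N}$ produced by Algorithm 1 and used in (d-LP), and let $\{\tau_j\}_{j \in N}$ be the corresponding cumulative absolute value estimations. Then, for any feasible solution $y$ of (d-LP)

\[ p(y) \in c + \sum_{j \in N} y_j \rho_j \pm \varepsilon_1 \sum_{j \in N} \tau_j \pm (d-1)\varepsilon_2 n^{d-1+\delta} \tag{13} \]

Proof. By Lemma 4.3, for any polynomial $p_j$, $p_j(y) \in \rho_j \pm \varepsilon_1 \tau_j \pm (d-1)\varepsilon_2 n^{d-2+\delta}$. Therefore,

\[
\begin{aligned}
p(y) = c + \sum_{j \in N} y_j p_j(y) &\in c + \sum_{j \in N} y_j \left( \rho_j \pm \varepsilon_1 \tau_j \pm (d-1)\varepsilon_2 n^{d-2+\delta} \right) \\
&= \Big( c + \sum_{j \in N} y_j \rho_j \Big) \pm \varepsilon_1 \sum_{j \in N} y_j \tau_j \pm (d-1)\varepsilon_2 n^{d-2+\delta} \sum_{j \in N} y_j \\
&\in c + \sum_{j \in N} y_j \rho_j \pm \varepsilon_1 \sum_{j \in N} \tau_j \pm (d-1)\varepsilon_2 n^{d-1+\delta}
\end{aligned}
\]

The second inclusion holds because $y_j \in [0,1]$ and $\sum_{j \in N} y_j \le n$. □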

4.5 Randomized Rounding of the Fractional Optimum 

The last step is to round the fractional optimum $y^* = (y^*_1, \ldots, y^*_n)$ of (d-LP) to an integral solution $z = (z_1, \ldots, z_n)$ that almost satisfies the constraints of (d-IP) and has an expected objective value for (d-IP) very close to the objective value of $y^*$.

To this end, we use randomized rounding, as in [28]. In particular, we set independently each $z_j$ to 1, with probability $y^*_j$, and to 0, with probability $1 - y^*_j$. The analysis is based on the following lemma, whose proof is similar to the proof of Lemma 4.1.


Lemma 4.5. Let $y \in [0,1]^n$ be any fractional vector and let $z \in \{0,1\}^n$ be an integral vector obtained from $y$ by randomized rounding. Also, let $(\rho_j)_{j \in N}$ be any sequence such that for some integer $q \ge 0$ and some constant $\beta \ge 1$, $\rho_j \in [0, (q+1)\beta n^q]$, for all $j \in N$. For all integers $k \ge 1$ and for all constants $\alpha, \delta > 0$ (and assuming that $n$ is sufficiently large), if $\hat\rho = \sum_{j \in N} \rho_j z_j$ and $\rho = \sum_{j \in N} \rho_j y_j$, with probability at least $1 - 2/n^{k+1}$,

\[ (1 - \alpha)\rho - (1 - \alpha)\alpha n^{q+\delta} \le \hat\rho \le (1 + \alpha)\rho + (1 + \alpha)\alpha n^{q+\delta} \tag{14} \]

Proof. We first note that $\mathbb{E}[\hat\rho] = \rho$. If $\rho = \Omega(n^q \ln n)$, then $|\hat\rho - \rho| \le \alpha\rho$, with high probability, by standard Chernoff bounds. If $\rho = o(n^q \ln n)$, the lower bound in (14) becomes trivial, because $\hat\rho \ge 0$ and $o(n^q \ln n) < \alpha n^{q+\delta}$, if $n$ is sufficiently large. As for the upper bound, we increase the coefficients $\rho_j$ to $\rho'_j \in [0, (q+1)\beta n^q]$, so that $\rho' = \Theta(n^q \ln n)$. Then, the upper bound is shown as in the second part of the proof of Lemma 4.1.

We proceed to the formal proof. For simplicity of notation, we let $B = (q+1)\beta n^q$ throughout the proof. For $j = 1, \ldots, n$, we let $X_j = z_j \rho_j/B$ be a random variable distributed in $[0,1]$. Each $X_j$ independently takes the value $\rho_j/B$, with probability $y_j$, and 0, otherwise. We let $X = \sum_{j=1}^{n} X_j$ be the sum of these independent random variables. Then, $\mathbb{E}[X] = \rho/B$ and $X = \sum_{j \in N} z_j \rho_j/B = \hat\rho/B$.

As in Lemma 4.1, we distinguish between the case where $\rho \ge 3(k+1)B \ln n/\alpha^2$ and the case where $\rho < 3(k+1)B \ln n/\alpha^2$. We start with the case where $\rho \ge 3(k+1)B \ln n/\alpha^2$. Then, by Chernoff bounds (we use the bound in footnote 5),

\[ \mathbb{P}\big[|X - \mathbb{E}[X]| > \alpha \mathbb{E}[X]\big] \le 2\exp\!\left(-\frac{\alpha^2 \rho}{3B}\right) \le 2\exp(-(k+1)\ln n) \le 2/n^{k+1} , \]

where we use that $\rho \ge 3(k+1)B \ln n/\alpha^2$. Therefore, with probability at least $1 - 2/n^{k+1}$,

\[ (1 - \alpha)\rho/B \le X \le (1 + \alpha)\rho/B . \]

Multiplying everything by $B$ and using that $X = \hat\rho/B$, we obtain that with probability at least $1 - 2/n^{k+1}$, $(1 - \alpha)\rho \le \hat\rho \le (1 + \alpha)\rho$, which implies (14).

We proceed to the case where $\rho < 3(k+1)B \ln n/\alpha^2$. Then, assuming that $n$ is large enough that $n^{\delta}/\ln n \ge 3(k+1)(q+1)\beta/\alpha^3$, we obtain that $(1 - \alpha)\rho < (1 - \alpha)\alpha n^{q+\delta}$. Therefore, since $\hat\rho \ge 0$, because $\rho_j \ge 0$, for all $j \in N$, the lower bound of (14) on $\hat\rho$ is trivial. For the upper bound, we show that with probability at least $1 - 1/n^{k+1}$, $\hat\rho \le (1 + \alpha)\alpha n^{q+\delta}$. To this end, we consider a sequence $(\rho'_j)_{j \in N}$ so that $\rho_j \le \rho'_j \le (q+1)\beta n^q$, for all $j \in N$, and

\[ \rho' = \sum_{j \in N} \rho'_j y_j = \frac{3(k+1)B \ln n}{\alpha^2} . \]

We can obtain such a sequence by increasing an appropriate subset of the $\rho_j$ up to $(q+1)\beta n^q$ (if $\sum_{j \in N} y_j$ is not large enough, we may also increase some $y_j$ up to 1). For the new sequence, we let $\hat\rho' = \sum_{j \in N} \rho'_j z_j$ and observe that $\hat\rho \le \hat\rho'$, for any instantiation of the randomized rounding (if some $y_j$ are increased, the inequality below follows from a standard coupling argument). Therefore,

\[ \mathbb{P}\big[\hat\rho > (1 + \alpha)\alpha n^{q+\delta}\big] \le \mathbb{P}\big[\hat\rho' > (1 + \alpha)\rho'\big] , \]

where we use that $\rho' = 3(k+1)B \ln n/\alpha^2$ and that $\alpha n^{\delta} \ge 3(k+1)(q+1)\beta \ln n/\alpha^2$, which holds if $n$ is sufficiently large. By the choice of $\rho'$, we can apply the same Chernoff bound as above and obtain that $\mathbb{P}[\hat\rho' > (1 + \alpha)\rho'] \le 1/n^{k+1}$. □



Lemma 4.5 implies that if the estimations $\rho_j$ are non-negative, the rounded solution $z$ is almost feasible for (d-IP) with high probability. But, as in Section 4.2, we need a generalization of Lemma 4.5 that deals with both positive and negative estimations. To this end, we work as in the proof of Lemma 4.2. Given a sequence of estimations $(\rho_j)_{j \in N}$, with $\rho_j \in [-(q+1)\beta n^q, (q+1)\beta n^q]$, we define $\rho_j^+ = \max\{\rho_j, 0\}$ and $\rho_j^- = \min\{\rho_j, 0\}$, for all $j \in N$. Moreover, we let $\hat\rho^+ = \sum_{j \in N} \rho_j^+ z_j$, $\rho^+ = \sum_{j \in N} \rho_j^+ y_j$, $\hat\rho^- = \sum_{j \in N} \rho_j^- z_j$ and $\rho^- = \sum_{j \in N} \rho_j^- y_j$. Applying Lemma 4.5 once for positive estimations and once for negative estimations (with the absolute values of $\rho_j^-$, $\hat\rho^-$ and $\rho^-$, instead), we obtain that with probability at least $1 - 4/n^{k+1}$,

\[ (1 - \alpha)\rho^+ - (1 - \alpha)\alpha n^{q+\delta} \le \hat\rho^+ \le (1 + \alpha)\rho^+ + (1 + \alpha)\alpha n^{q+\delta} \]

\[ (1 + \alpha)\rho^- - (1 + \alpha)\alpha n^{q+\delta} \le \hat\rho^- \le (1 - \alpha)\rho^- + (1 - \alpha)\alpha n^{q+\delta} \]

Using that $\hat\rho = \hat\rho^+ + \hat\rho^-$ and that $\rho = \rho^+ + \rho^-$, we obtain the following generalization of Lemma 4.5.


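Lemma 4.6 (Rounding Lemma). Let $y \in [0,1]^n$ be any fractional vector and let $z \in \{0,1\}^n$ be an integral vector obtained from $y$ by randomized rounding. Also, let $(\rho_j)_{j \in N}$ be any sequence such that for some integer $q \ge 0$ and some constant $\beta \ge 1$, $|\rho_j| \le (q+1)\beta n^q$, for all $j \in N$. For all integers $k \ge 1$ and for all constants $\alpha, \delta > 0$ (and assuming that $n$ is sufficiently large), if $\hat\rho = \sum_{j \in N} \rho_j z_j$, $\rho = \sum_{j \in N} \rho_j y_j$ and $\bar\rho = \sum_{j \in N} |\rho_j|$, with probability at least $1 - 4/n^{k+1}$,

\[ \rho - \alpha\bar\rho - 2\alpha n^{q+\delta} \le \hat\rho \le \rho + \alpha\bar\rho + 2\alpha n^{q+\delta} \tag{15} \]

For all constants $\varepsilon_1, \varepsilon_2 > 0$ and all constants $c$, we can use Lemma 4.6 with $\alpha = \min\{\varepsilon_1, \varepsilon_2/2\}$ (so that $\alpha \le \varepsilon_1$ and $2\alpha \le \varepsilon_2$) and obtain that for all integers $k \ge 1$, with probability at least $1 - 4/n^{k+1}$, the following holds for the binary vector $z$ obtained from a fractional vector $y$ by randomized rounding:

\[ c + \sum_{j \in N} y_j \rho_j - \varepsilon_1 \sum_{j \in N} |\rho_j| - \varepsilon_2 n^{q+\delta} \le c + \sum_{j \in N} z_j \rho_j \le c + \sum_{j \in N} y_j \rho_j + \varepsilon_1 \sum_{j \in N} |\rho_j| + \varepsilon_2 n^{q+\delta} \tag{16} \]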

Using (16) with $k = 2(d+1)$, the fact that $y^*$ is a feasible solution to (d-LP), and the fact that (d-LP) has at most $2n^{d-1}$ constraints, we obtain that $z$ is an almost feasible solution to (d-IP) with high probability. Namely, with probability at least $1 - 8/n^{d+4}$, the integral vector $z$ obtained from the fractional optimum $y^*$ by randomized rounding satisfies the following system of inequalities for all levels $\ell \ge 1$ and all tuples $(i_1, \ldots, i_{d-\ell}) \in N^{d-\ell}$ (for each level $\ell \ge 1$, we use $q = \ell - 1$, since $|\rho_{i_1 \ldots i_{d-\ell} j}| \le \ell \beta n^{\ell-1}$ for all $j \in N$):

\[ c_{i_1 \ldots i_{d-\ell}} + \sum_{j \in N} z_j \rho_{i_1 \ldots i_{d-\ell} j} \in \rho_{i_1 \ldots i_{d-\ell}} \pm 2\varepsilon_1 \bar\rho_{i_1 \ldots i_{d-\ell}} \pm 2\varepsilon_2 n^{\ell-1+\delta} \tag{17} \]

Having established that $z$ is an almost feasible solution to (d-IP), with high probability, we proceed as in Section 3. By linearity of expectation, $\mathbb{E}[\sum_{j \in N} z_j \rho_j] = \sum_{j \in N} y^*_j \rho_j$. Moreover, the probability that $z$ does not satisfy (17) for some level $\ell \ge 1$ and some tuple $(i_1, \ldots, i_{d-\ell}) \in N^{d-\ell}$ is at most $8/n^{d+4}$ and the objective value of (d-IP) is at most $2(d+1)\beta n^d$, because, due to the $\beta$-smoothness property of $p(x)$, $|p(x^*)| \le (d+1)\beta n^d$. Therefore, the expected value of a rounded solution $z$ that satisfies the family of inequalities (17) for all levels and tuples is at least $\sum_{j \in N} y^*_j \rho_j - 1$ (assuming that $n$ is sufficiently large). Using the method of conditional expectations, as in [27], we can find in (deterministic) polynomial time an integral solution $z$ that satisfies the family of inequalities (17) for all levels and tuples and has $c + \sum_{j \in N} z_j \rho_j \ge c - 1 + \sum_{j \in N} y^*_j \rho_j$. As in Section 3.3, we sometimes abuse the notation and refer to such an integral solution $z$ (computed deterministically) as the integral solution obtained from $y^*$ by randomized rounding.
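The rounding step and the linearity-of-expectation identity it relies on can be sketched as follows. This is only an illustration on tiny inputs: the expectation is computed exactly by enumerating all $2^n$ outcomes, which is not how the algorithm proceeds, and the derandomization by conditional expectations is not shown. Function names are hypothetical.

```python
import random
from itertools import product


def randomized_rounding(y, rng):
    """Set z_j = 1 with probability y_j, independently for each j."""
    return [1 if rng.random() < yj else 0 for yj in y]


def expected_objective(rho, y):
    """Exact E[sum_j z_j * rho_j], enumerating all 2^n outcomes of the rounding."""
    total = 0.0
    for z in product([0, 1], repeat=len(y)):
        prob = 1.0
        for zj, yj in zip(z, y):
            prob *= yj if zj else 1.0 - yj
        total += prob * sum(zj * rj for zj, rj in zip(z, rho))
    return total
```

On any small instance, `expected_objective(rho, y)` matches $\sum_j y_j \rho_j$ up to floating-point error, in line with the identity used above.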

The following lemmas are similar to Lemma 4.3 and Lemma 4.4. They use the notion of cumulative absolute value estimations and show that the objective value $p(z)$ of the rounded solution $z$ is close to the optimal value of (d-LP).

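Lemma 4.7. Let $y^*$ be an optimal solution of (d-LP) and let $z$ be the integral solution obtained from $y^*$ by randomized rounding (and the method of conditional expectations). Then, for any level $\ell \ge 1$ in the decomposition of $p(x)$ and any tuple $(i_1, \ldots, i_{d-\ell}) \in N^{d-\ell}$,

\[ p_{i_1 \ldots i_{d-\ell}}(z) \in \rho_{i_1 \ldots i_{d-\ell}} \pm 2\varepsilon_1 \tau_{i_1 \ldots i_{d-\ell}} \pm 2\ell\varepsilon_2 n^{\ell-1+\delta} \tag{18} \]

Proof. The proof is by induction on the degree $\ell$ and similar to the proof of Lemma 4.3. The basis, for $\ell = 1$, is trivial, because in the decomposition of $p(x)$, each $p_{i_1 \ldots i_{d-1} j}(x)$ is a constant $c_{i_1 \ldots i_{d-1} j}$. Therefore, $\rho_{i_1 \ldots i_{d-1} j} = c_{i_1 \ldots i_{d-1} j}$ and

\[ p_{i_1 \ldots i_{d-1}}(z) = c_{i_1 \ldots i_{d-1}} + \sum_{j \in N} z_j p_{i_1 \ldots i_{d-1} j}(z) = c_{i_1 \ldots i_{d-1}} + \sum_{j \in N} z_j \rho_{i_1 \ldots i_{d-1} j} \in \rho_{i_1 \ldots i_{d-1}} \pm 2\varepsilon_1 \bar\rho_{i_1 \ldots i_{d-1}} \pm 2\varepsilon_2 n^{\delta} , \]

where the inclusion follows from the approximate feasibility of $z$ for (d-LP), as expressed by (17). We also use that at level $\ell = 1$, $\tau_{i_1 \ldots i_{d-1}} = \bar\rho_{i_1 \ldots i_{d-1}}$.

We inductively assume that (18) is true for the values of all degree-$(\ell-1)$ polynomials $p_{i_1 \ldots i_{d-\ell} j}$ at $z$ and establish the lemma for $p_{i_1 \ldots i_{d-\ell}}(z) = c_{i_1 \ldots i_{d-\ell}} + \sum_{j \in N} z_j p_{i_1 \ldots i_{d-\ell} j}(z)$. We have that:

\[
\begin{aligned}
p_{i_1 \ldots i_{d-\ell}}(z) &= c_{i_1 \ldots i_{d-\ell}} + \sum_{j \in N} z_j p_{i_1 \ldots i_{d-\ell} j}(z) \\
&\in c_{i_1 \ldots i_{d-\ell}} + \sum_{j \in N} z_j \left( \rho_{i_1 \ldots i_{d-\ell} j} \pm 2\varepsilon_1 \tau_{i_1 \ldots i_{d-\ell} j} \pm 2(\ell-1)\varepsilon_2 n^{\ell-2+\delta} \right) \\
&= \Big( c_{i_1 \ldots i_{d-\ell}} + \sum_{j \in N} z_j \rho_{i_1 \ldots i_{d-\ell} j} \Big) \pm 2\varepsilon_1 \sum_{j \in N} z_j \tau_{i_1 \ldots i_{d-\ell} j} \pm 2(\ell-1)\varepsilon_2 n^{\ell-2+\delta} \sum_{j \in N} z_j \\
&\in \left( \rho_{i_1 \ldots i_{d-\ell}} \pm 2\varepsilon_1 \bar\rho_{i_1 \ldots i_{d-\ell}} \pm 2\varepsilon_2 n^{\ell-1+\delta} \right) \pm 2\varepsilon_1 \sum_{j \in N} \tau_{i_1 \ldots i_{d-\ell} j} \pm 2(\ell-1)\varepsilon_2 n^{\ell-1+\delta} \\
&\in \rho_{i_1 \ldots i_{d-\ell}} \pm 2\varepsilon_1 \tau_{i_1 \ldots i_{d-\ell}} \pm 2\ell\varepsilon_2 n^{\ell-1+\delta}
\end{aligned}
\]

The first inclusion holds by the induction hypothesis. The second inclusion holds because: (i) $z$ is an approximately feasible solution to (d-IP) and thus, $c_{i_1 \ldots i_{d-\ell}} + \sum_{j \in N} z_j \rho_{i_1 \ldots i_{d-\ell} j}$ satisfies (17); (ii) $\sum_{j \in N} z_j \tau_{i_1 \ldots i_{d-\ell} j} \le \sum_{j \in N} \tau_{i_1 \ldots i_{d-\ell} j}$; and (iii) $\sum_{j \in N} z_j \le n$. The last inclusion holds because $\tau_{i_1 \ldots i_{d-\ell}} = \bar\rho_{i_1 \ldots i_{d-\ell}} + \sum_{j \in N} \tau_{i_1 \ldots i_{d-\ell} j}$, by the definition of cumulative absolute value estimations. □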

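Lemma 4.8. Let $y^*$ be an optimal solution of (d-LP) and let $z$ be the integral solution obtained from $y^*$ by randomized rounding (and the method of conditional expectations). Then,

\[ p(z) \in c + \sum_{j \in N} z_j \rho_j \pm 2\varepsilon_1 \sum_{j \in N} \tau_j \pm 2(d-1)\varepsilon_2 n^{d-1+\delta} \tag{19} \]

Proof. By Lemma 4.7, for any polynomial $p_j$ appearing in the decomposition of $p(x)$, we have that $p_j(z) \in \rho_j \pm 2\varepsilon_1 \tau_j \pm 2(d-1)\varepsilon_2 n^{d-2+\delta}$. Therefore,

\[
\begin{aligned}
p(z) = c + \sum_{j \in N} z_j p_j(z) &\in c + \sum_{j \in N} z_j \left( \rho_j \pm 2\varepsilon_1 \tau_j \pm 2(d-1)\varepsilon_2 n^{d-2+\delta} \right) \\
&= c + \sum_{j \in N} z_j \rho_j \pm 2\varepsilon_1 \sum_{j \in N} z_j \tau_j \pm 2(d-1)\varepsilon_2 n^{d-2+\delta} \sum_{j \in N} z_j \\
&\in c + \sum_{j \in N} z_j \rho_j \pm 2\varepsilon_1 \sum_{j \in N} \tau_j \pm 2(d-1)\varepsilon_2 n^{d-1+\delta}
\end{aligned}
\]

The second inclusion holds because $z_j \in \{0,1\}$ and $\sum_{j \in N} z_j \le n$. □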

4.6 Cumulative Absolute Value Estimations of δ-Bounded Polynomials

To bound the total error of the algorithm, in Section 4.7, we need an upper bound on the sum of the cumulative absolute value estimations at the top level of the decomposition of a $\beta$-smooth $\delta$-bounded polynomial $p(x)$. In this section, we show that $\sum_{j \in N} \tau_j = O(d^2 \beta n^{d-1+\delta})$. This upper bound is an immediate consequence of an upper bound of $O(d\beta n^{d-1+\delta})$ on the sum of the absolute value estimations, for each level $\ell$ of the decomposition of $p(x)$.

For simplicity and clarity, we assume, in the statements of the lemmas below and in their proofs, that the hidden constant in the definition of $p(x)$ as a $\delta$-bounded polynomial is 1. If this constant is some $k \ge 1$, we should multiply the upper bounds of Lemma 4.9 and Lemma 4.10 by $k$.

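Lemma 4.9. Let $p(x)$ be an $n$-variate degree-$d$ $\beta$-smooth $\delta$-bounded polynomial. Also let $\rho_{i_1 \ldots i_{d-\ell}}$ and $\bar\rho_{i_1 \ldots i_{d-\ell}}$ be the estimations and absolute value estimations, for all levels $\ell \in \{1, \ldots, d-1\}$ of the decomposition of $p(x)$ and all tuples $(i_1, \ldots, i_{d-\ell}) \in N^{d-\ell}$, computed by Algorithm 1 and used in (d-LP) and (d-IP). Then, for each level $\ell \ge 1$, the sum of the absolute value estimations is:

\[ \sum_{(i_1, \ldots, i_{d-\ell}) \in N^{d-\ell}} \bar\rho_{i_1 \ldots i_{d-\ell}} \le \ell \beta n^{d-1+\delta} \tag{20} \]

Proof. The proof is by induction on the level $\ell$ of the decomposition. For the basis, we recall that for $\ell = 1$, level-1 absolute value estimations are defined as

\[ \bar\rho_{i_1 \ldots i_{d-1}} = \sum_{j \in N} |\rho_{i_1 \ldots i_{d-1} j}| = \sum_{j \in N} |c_{i_1 \ldots i_{d-1} j}| . \]

This holds because, in Algorithm 1, each level-0 estimation $\rho_{i_1 \ldots i_{d-1} i_d}$ is equal to the coefficient $c_{i_1 \ldots i_{d-1} i_d}$ of the corresponding degree-$d$ monomial. Hence, if $p(x)$ is a degree-$d$ $\beta$-smooth $\delta$-bounded polynomial, we have that

\[ \sum_{(i_1, \ldots, i_{d-1}) \in N^{d-1}} \bar\rho_{i_1 \ldots i_{d-1}} = \sum_{(i_1, \ldots, i_d) \in N^d} |c_{i_1 \ldots i_d}| \le \beta n^{d-1+\delta} \tag{21} \]

The upper bound holds because by the definition of degree-$d$ $\beta$-smooth $\delta$-bounded polynomials, for each $\ell \in \{0, \ldots, d\}$, the sum, over all monomials of degree $d - \ell$, of the absolute values of their coefficients is $O(\beta n^{d-1+\delta})$ (and assuming that the hidden constant is 1, at most $\beta n^{d-1+\delta}$). In (21), we use this upper bound for $\ell = 0$ and for the absolute values of the coefficients of all degree-$d$ monomials in the expansion of $p(x)$.

For the induction step, we consider any level $\ell \ge 2$. We observe that any binary vector $x$ satisfies the level-$(\ell-1)$ constraints of (d-LP) and (d-IP) with certainty, if for each level-$(\ell-1)$ estimation,

\[ |\rho_{i_1 \ldots i_{d-\ell} j}| \le |c_{i_1 \ldots i_{d-\ell} j}| + \bar\rho_{i_1 \ldots i_{d-\ell} j} . \]

We also note that we can easily enforce such upper bounds on the estimations computed by Algorithm 1. Since each level-$\ell$ absolute value estimation is defined as $\bar\rho_{i_1 \ldots i_{d-\ell}} = \sum_{j \in N} |\rho_{i_1 \ldots i_{d-\ell} j}|$, we obtain that for any level $\ell \ge 2$,

\[ \sum_{(i_1, \ldots, i_{d-\ell}) \in N^{d-\ell}} \bar\rho_{i_1 \ldots i_{d-\ell}} \le \sum_{(i_1, \ldots, i_{d-\ell}, j) \in N^{d-\ell+1}} \left( |c_{i_1 \ldots i_{d-\ell} j}| + \bar\rho_{i_1 \ldots i_{d-\ell} j} \right) \le \beta n^{d-1+\delta} + (\ell-1)\beta n^{d-1+\delta} = \ell \beta n^{d-1+\delta} . \]

For the second inequality, we use the induction hypothesis and that since $p(x)$ is $\beta$-smooth and $\delta$-bounded, the sum, over all monomials of degree $d - \ell + 1$, of the absolute values $|c_{i_1 \ldots i_{d-\ell} j}|$ of their coefficients $c_{i_1 \ldots i_{d-\ell} j}$ is at most $\beta n^{d-1+\delta}$. We also use the fact that the estimations are computed over the decomposition tree of the polynomial $p(x)$. Hence, each coefficient $c_{i_1 \ldots i_{d-\ell} j}$ is included only once in the sum. □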


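Lemma 4.10. Let $p(x)$ be an $n$-variate degree-$d$ $\beta$-smooth $\delta$-bounded polynomial. Also let $\tau_{i_1 \ldots i_{d-\ell}}$ be the cumulative absolute value estimations, for all levels $\ell \in \{1, \ldots, d-1\}$ of the decomposition of $p(x)$ and all tuples $(i_1, \ldots, i_{d-\ell}) \in N^{d-\ell}$, corresponding to the estimations $\rho_{i_1 \ldots i_{d-\ell}}$ computed by Algorithm 1 and used in (d-LP) and (d-IP). Then,

\[ \sum_{j \in N} \tau_j \le d(d-1)\beta n^{d-1+\delta}/2 \tag{22} \]

Proof. Using induction on the level $\ell$ of the decomposition and Lemma 4.9, we show that for each level $\ell \ge 1$, the sum of the cumulative absolute value estimations is:

\[ \sum_{(i_1, \ldots, i_{d-\ell}) \in N^{d-\ell}} \tau_{i_1 \ldots i_{d-\ell}} \le (\ell+1)\ell \beta n^{d-1+\delta}/2 \tag{23} \]

The conclusion of the lemma is obtained by applying (23) for the first level of the decomposition of $p(x)$, i.e., for $\ell = d-1$.

For the basis, we recall that for $\ell = 1$, level-1 cumulative absolute value estimations are defined as $\tau_{i_1 \ldots i_{d-1}} = \bar\rho_{i_1 \ldots i_{d-1}}$. Using Lemma 4.9, we obtain that:

\[ \sum_{(i_1, \ldots, i_{d-1}) \in N^{d-1}} \tau_{i_1 \ldots i_{d-1}} = \sum_{(i_1, \ldots, i_{d-1}) \in N^{d-1}} \bar\rho_{i_1 \ldots i_{d-1}} \le \beta n^{d-1+\delta} . \]

We recall (see also Section 4.4) that for each $\ell \ge 2$, level-$\ell$ cumulative absolute value estimations are defined as $\tau_{i_1 \ldots i_{d-\ell}} = \bar\rho_{i_1 \ldots i_{d-\ell}} + \sum_{j \in N} \tau_{i_1 \ldots i_{d-\ell} j}$. Summing up over all tuples $(i_1, \ldots, i_{d-\ell}) \in N^{d-\ell}$, we obtain that for any level $\ell \ge 2$,

\[
\begin{aligned}
\sum_{(i_1, \ldots, i_{d-\ell}) \in N^{d-\ell}} \tau_{i_1 \ldots i_{d-\ell}} &= \sum_{(i_1, \ldots, i_{d-\ell}) \in N^{d-\ell}} \Big( \bar\rho_{i_1 \ldots i_{d-\ell}} + \sum_{j \in N} \tau_{i_1 \ldots i_{d-\ell} j} \Big) \\
&= \sum_{(i_1, \ldots, i_{d-\ell}) \in N^{d-\ell}} \bar\rho_{i_1 \ldots i_{d-\ell}} + \sum_{(i_1, \ldots, i_{d-\ell}, j) \in N^{d-\ell+1}} \tau_{i_1 \ldots i_{d-\ell} j} \\
&\le \ell \beta n^{d-1+\delta} + \ell(\ell-1)\beta n^{d-1+\delta}/2 = (\ell+1)\ell \beta n^{d-1+\delta}/2 ,
\end{aligned}
\]

where the inequality follows from Lemma 4.9 and from the induction hypothesis. □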
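The induction in the proof of Lemma 4.10 amounts to the recurrence $T(1) = 1$, $T(\ell) = \ell + T(\ell - 1)$, measured in units of $\beta n^{d-1+\delta}$. The following sketch, with hypothetical function names, unrolls the recurrence so that it can be compared against the closed form $(\ell+1)\ell/2$.

```python
from fractions import Fraction


def abs_sum_bound(level):
    """Lemma 4.9 bound on the level-l sum, in units of beta * n^(d-1+delta)."""
    return Fraction(level)


def cumulative_sum_bound(level):
    """Unroll T(1) = 1, T(l) = l + T(l - 1) from the induction step."""
    if level == 1:
        return abs_sum_bound(1)
    return abs_sum_bound(level) + cumulative_sum_bound(level - 1)
```

Evaluating at $\ell = d - 1$ gives the coefficient $d(d-1)/2$ that appears in (22).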


4.7 The Final Algorithmic Result 

We are ready now to conclude this section with the following theorem. 

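Theorem 4.1. Let $p(x)$ be an $n$-variate degree-$d$ $\beta$-smooth $\delta$-bounded polynomial. Then, for any $\varepsilon > 0$, we can compute, in time $2^{O(d^7 \beta^3 n^{1-\delta} \ln n/\varepsilon^3)}$ and with probability at least $1 - 8/n^2$, a binary vector $z$ so that $p(z) \ge p(x^*) - \varepsilon n^{d-1+\delta}$, where $x^*$ is the maximizer of $p(x)$.

Proof. Based upon the discussion above in this section, for any constant $\varepsilon > 0$, if $p(x)$ is an $n$-variate degree-$d$ $\beta$-smooth $\delta$-bounded polynomial, the algorithm described in the previous sections computes an integral solution $z$ that approximately maximizes $p(x)$. Specifically, setting $\varepsilon_1 = \varepsilon/(4d(d-1)\beta)$ and $\varepsilon_2 = \varepsilon/(8(d-1))$, $p(z)$ satisfies the following with probability at least $1 - 8/n^2$ (and for $n$ sufficiently large):

\[
\begin{aligned}
p(z) &\ge \Big( c + \sum_{j \in N} z_j \rho_j \Big) - \frac{\varepsilon}{2d(d-1)\beta} \sum_{j \in N} \tau_j - \frac{\varepsilon}{4} n^{d-1+\delta} \\
&\ge \Big( c + \sum_{j \in N} y^*_j \rho_j \Big) - 1 - \frac{\varepsilon}{2} n^{d-1+\delta} \\
&\ge \Big( c + \sum_{j \in N} x^*_j \rho_j \Big) - 1 - \frac{\varepsilon}{2} n^{d-1+\delta} \\
&\ge p(x^*) - \frac{\varepsilon}{4d(d-1)\beta} \sum_{j \in N} \tau_j - \frac{\varepsilon}{8} n^{d-1+\delta} - 1 - \frac{\varepsilon}{2} n^{d-1+\delta} \\
&\ge p(x^*) - \varepsilon n^{d-1+\delta}
\end{aligned}
\]

The first inequality follows from Lemma 4.8. The second inequality follows from the hypothesis that $p(x)$ is $\beta$-smooth and $\delta$-bounded: then Lemma 4.10 implies that $\sum_{j \in N} \tau_j \le d(d-1)\beta n^{d-1+\delta}/2$, and the rounded solution satisfies $c + \sum_{j \in N} z_j \rho_j \ge c - 1 + \sum_{j \in N} y^*_j \rho_j$ (Section 4.5). As in Section 4.6, we assume that the constant hidden in the definition of $p(x)$ as a $\delta$-bounded polynomial is 1. If this constant is some $k \ge 1$, we should also divide $\varepsilon_1$ by $k$. The third inequality holds because $y^*$ is an optimal solution to (d-LP) and $x^*$ is a feasible solution to (d-LP). The fourth inequality follows from Lemma 4.4. For the last inequality, we again use Lemma 4.10, which bounds the accumulated error by $3\varepsilon n^{d-1+\delta}/4 + 1$; this is at most $\varepsilon n^{d-1+\delta}$ if $n$ is sufficiently large. The running time is dominated by the exhaustive sampling step with $\gamma = \Theta(dq\beta/(\varepsilon_1^2 \varepsilon_2)) = O(d^7 \beta^3/\varepsilon^3)$, since $q \le d$. This concludes the proof of Theorem 4.1. □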
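The error accounting in the proof can be double-checked with exact arithmetic. The sketch below assumes the parameter setting $\varepsilon_1 = \varepsilon/(4d(d-1)\beta)$, $\varepsilon_2 = \varepsilon/(8(d-1))$ and the bound of Lemma 4.10 with hidden constant 1; all quantities are in units of $n^{d-1+\delta}$, and the additive loss of 1 from rounding is ignored. The function name is hypothetical.

```python
from fractions import Fraction


def error_budget(eps, d, beta):
    """Return (rounding error, LP gap, total), in units of n^(d-1+delta)."""
    eps1 = eps / (4 * d * (d - 1) * beta)
    eps2 = eps / (8 * (d - 1))
    tau_sum = Fraction(d * (d - 1), 2) * beta             # Lemma 4.10
    rounding = 2 * eps1 * tau_sum + 2 * (d - 1) * eps2    # Lemma 4.8, for z
    lp_gap = eps1 * tau_sum + (d - 1) * eps2              # Lemma 4.4, for x*
    return rounding, lp_gap, rounding + lp_gap
```

With these choices the rounding error is $\varepsilon/2$ and the LP gap is $\varepsilon/4$, so the total stays below $\varepsilon$ and leaves room to absorb the additive 1 for large $n$.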


Max-$k$-CSP: Using Theorem 4.1, it is a straightforward observation that for any Max-$k$-CSP problem (for constant $k$) we can obtain an algorithm which, given a Max-$k$-CSP instance with $\Omega(n^{k-1+\delta})$ constraints for some $\delta > 0$, for any $\varepsilon > 0$ returns an assignment that satisfies $(1-\varepsilon)\mathrm{OPT}$ constraints in time $2^{O(n^{1-\delta} \ln n/\varepsilon^3)}$. This follows from Theorem 4.1 using two observations: first, the standard arithmetization of Max-$k$-CSP described in Section 2 produces a degree-$k$ $\beta$-smooth $\delta$-bounded polynomial for $\beta$ depending only on $k$. Second, the optimal solution of such an instance satisfies at least $\Omega(n^{k-1+\delta})$ constraints, therefore the additive error given in Theorem 4.1 is $O(\varepsilon\,\mathrm{OPT})$. This algorithm for Max-$k$-CSP contains as special cases algorithms for various standard problems such as Max-CUT, Max-DICUT and Max-$k$-SAT.
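As a small illustration of the arithmetization step, the following sketch evaluates a degree-$k$ polynomial for a Max-$k$-SAT instance on a 0/1 assignment. The clause encoding (signed integers) and the names `sat_polynomial` and `max_sat_brute_force` are assumptions made for this example and are not notation from the paper; each clause contributes 1 minus the product of the factors that vanish when a literal is true, which is the usual way of writing "the clause is satisfied" as a polynomial.

```python
from itertools import product

def sat_polynomial(clauses, x):
    """Value of p(x) = sum over clauses of 1 - prod_l (1 - value(l)).

    clauses: list of tuples of nonzero ints; literal +i means variable i,
             literal -i means its negation (variables are 1-indexed).
    x: dict mapping variable index -> 0 or 1.
    """
    total = 0
    for clause in clauses:
        falsified = 1  # stays 1 only if every literal evaluates to 0
        for lit in clause:
            value = x[abs(lit)] if lit > 0 else 1 - x[abs(lit)]
            falsified *= 1 - value
        total += 1 - falsified
    return total

def max_sat_brute_force(clauses, n):
    """Maximize p over all 0/1 assignments (exponential; tiny n only)."""
    best = 0
    for bits in product((0, 1), repeat=n):
        x = {i + 1: b for i, b in enumerate(bits)}
        best = max(best, sat_polynomial(clauses, x))
    return best
```

On 0/1 inputs the polynomial value coincides with the number of satisfied clauses, so maximizing it over binary vectors is the same as solving the Max-$k$-SAT instance.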


5 Approximating the k-Densest Subgraph in Almost Sparse Graphs

In this section, we show how an extension of the approximation algorithms we have presented can be used to approximate the $k$-Densest Subgraph problem in $\delta$-almost sparse graphs. Recall that this is a problem also handled in [3], but only for the case where $k = \Omega(n)$. The reason that smaller values of $k$ are not handled by the scheme of [1] for dense graphs is that when $k = o(n)$ the optimal solution has objective value much smaller than the additive error of $\varepsilon n^2$ inherent in the scheme.

Here we obtain a sub-exponential time approximation scheme that works on graphs with $\Omega(n^{1+\delta})$ edges for all $k$ by judiciously combining two approaches: when $k$ is relatively large, we use a sampling approach similar to Max-CUT; when $k$ is small, we can resort to the naive algorithm that tries all possible solutions. We select (with some foresight) the threshold between the two algorithms to be $k = \Omega(n^{1-\delta/3})$, so that in the end we obtain an approximation scheme with running time of $2^{O(n^{1-\delta/3} \ln n)}$, that is, slightly slower than the approximation scheme for Max-CUT. It is clear that the brute-force algorithm achieves this running time for $k = O(n^{1-\delta/3})$, so in the remainder we focus on the case of large $k$.
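The small-$k$ branch can be sketched as follows. This is a minimal illustration of the naive algorithm, with hypothetical function names chosen for this example; it enumerates all $k$-subsets, and the helper shows that the number of candidates is at most $n^k = 2^{k \log_2 n}$, which is where the running time for small $k$ comes from.

```python
from itertools import combinations
from math import comb, log2

def densest_k_subgraph_brute_force(n, edges, k):
    """Return (edge_count, vertex_subset) maximizing induced edges over all k-subsets."""
    edge_set = {frozenset(e) for e in edges}
    best_count, best_set = -1, None
    for subset in combinations(range(n), k):
        # Count the edges induced by this candidate vertex set.
        count = sum(1 for pair in combinations(subset, 2)
                    if frozenset(pair) in edge_set)
        if count > best_count:
            best_count, best_set = count, subset
    return best_count, best_set

def log2_num_candidates(n, k):
    """log2 of the number of k-subsets examined; never more than k * log2(n)."""
    return log2(comb(n, k))
```

For instance, on a triangle plus a disjoint edge with $k = 3$, the search returns the triangle.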

The $k$-Densest Subgraph problem in a graph $G(V, E)$ is equivalent to maximizing, over all binary vectors $x \in \{0,1\}^n$, the $n$-variate degree-2 1-smooth polynomial $p(x) = \sum_{\{i,j\} \in E} x_i x_j$, under the linear constraint $\sum_{j \in V} x_j = k$. Setting a variable $x_i$ to 1 indicates that the vertex $i$ is included in the set $C$ that induces a dense subgraph $G[C]$ of $k$ vertices. Next, we assume that $G$ is $\delta$-almost sparse and thus, has $m = \Omega(n^{1+\delta})$ edges. As usual, $x^*$ denotes the optimal solution.

The algorithm follows the same general approach and the same basic steps as the algorithm for Max-CUT in Section 3. In the following, we highlight only the differences.

Obtaining Estimations by Exhaustive Sampling. We first observe that if $G$ is $\delta$-almost sparse and $k = \Omega(n^{1-\delta/3})$, then a random subset of $k$ vertices contains $\Omega(n^{1+\delta/3})$ edges in expectation. Hence, we can assume that the optimal solution induces at least $\Omega(n^{1+\delta/3})$ edges.
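The expectation claimed above can be checked directly: a fixed edge survives in a uniformly random $k$-subset with probability $k(k-1)/(n(n-1))$, so $m$ edges give $m\,k(k-1)/(n(n-1))$ induced edges in expectation, which is $\Omega(n^{1+\delta/3})$ for $m = \Omega(n^{1+\delta})$ and $k = \Omega(n^{1-\delta/3})$. The sketch below compares this closed form with an exact average over all $k$-subsets of a small graph; the function names are assumptions made for this example.

```python
from fractions import Fraction
from itertools import combinations

def expected_induced_edges(n, m, k):
    """Closed form: each of the m edges lies inside a random k-subset
    with probability k(k-1) / (n(n-1))."""
    return Fraction(m * k * (k - 1), n * (n - 1))

def average_induced_edges(n, edges, k):
    """Exact average of induced edge counts over all k-subsets (tiny n only)."""
    edge_set = {frozenset(e) for e in edges}
    total, count = 0, 0
    for subset in combinations(range(n), k):
        total += sum(1 for pair in combinations(subset, 2)
                     if frozenset(pair) in edge_set)
        count += 1
    return Fraction(total, count)
```

Since some subset must do at least as well as the average, the optimal $k$-subset induces at least this many edges.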

Working as in Section 3.2, we use exhaustive sampling and obtain for each vertex $j \in V$, an estimation $\rho_j$ of $j$'s neighbors in the optimal dense subgraph, i.e., $\rho_j$ is an estimation of $\hat{\rho}_j = \sum_{i \in N(j)} x^*_i$. For the analysis, we apply Lemma 3.1 with $n^{\delta/3}$ instead of $\Delta$, or in other words, we use a sample of size $\Theta(n^{1-\delta/3} \ln n)$. The reason is that we can only tolerate an additive error of $\varepsilon n^{1+\delta/3}$, by the lower bound on the optimal solution observed in the previous paragraph. Then, the running time due to exhaustive sampling is $2^{O(n^{1-\delta/3} \ln n)}$.

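Thus, by Lemma 3.1 and the discussion following it in Section 3.2, we obtain that for all $\varepsilon_1, \varepsilon_2 > 0$, if we use a sample of the size $\Theta(n^{1-\delta/3} \ln n/(\varepsilon_1^2 \varepsilon_2))$, with probability at least $1 - 2/n^2$, the following holds for all estimations $\rho_j$ and all vertices $j \in V$:

$$(1 - \varepsilon_1)\rho_j - \varepsilon_2 n^{\delta/3} \leq \hat{\rho}_j \leq (1 + \varepsilon_1)\rho_j + \varepsilon_2 n^{\delta/3} \tag{24}$$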
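The exhaustive sampling step can be pictured with the following sketch. It assumes (this is not spelled out in the passage above) that the estimate for vertex $j$ is obtained by counting $j$'s neighbors among the sampled vertices guessed to be in the solution and scaling by $n/|R|$, and that the sample contains no repeated vertices; the names `draw_sample`, `estimates_for_guess` and `all_estimate_vectors` are hypothetical. Trying every 0/1 guess on the sample guarantees that one of the guesses agrees with $x^*$ restricted to $R$.

```python
import random
from itertools import product

def draw_sample(n, size, rng):
    """Sample `size` distinct vertices from range(n)."""
    return rng.sample(range(n), size)

def estimates_for_guess(n, adj, sample, guess):
    """Scaled neighbor counts for one guess of the solution restricted to the sample.

    adj: list of neighbor sets; guess: tuple of 0/1 values aligned with `sample`.
    """
    scale = n / len(sample)
    chosen = {v for v, bit in zip(sample, guess) if bit}
    return [scale * len(adj[j] & chosen) for j in range(n)]

def all_estimate_vectors(n, adj, sample):
    """Yield (guess, estimates) for all 2^|R| guesses on the sample."""
    for guess in product((0, 1), repeat=len(sample)):
        yield guess, estimates_for_guess(n, adj, sample, guess)
```

The number of guesses is $2^{|R|}$, which matches the form of the running time due to exhaustive sampling stated above.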

Linearizing the Polynomial. Applying Proposition 2.1, we can write the polynomial $p(x)$ as $p(x) = \sum_{j \in V} x_j p_j(x)$, where $p_j(x) = \sum_{i \in N(j)} x_i$ is a degree-1 1-smooth polynomial that indicates how many neighbors of vertex $j$ are in $C$ in the solution corresponding to $x$. Then, using the estimations $\rho_j$ of $\sum_{i \in N(j)} x^*_i$, obtained by exhaustive sampling, we have that approximate maximization of $p(x)$ can be reduced to the solution of the following Integer Linear Program:

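$$\begin{aligned}
\max\ & \sum_{j \in V} y_j \rho_j & & \text{(IP')} \\
\text{s.t.}\ & (1 - \varepsilon_1)\rho_j - \varepsilon_2 n^{\delta/3} \leq \sum_{i \in N(j)} y_i \leq (1 + \varepsilon_1)\rho_j + \varepsilon_2 n^{\delta/3} & & \forall j \in V \\
& \sum_{j \in V} y_j = k \\
& y_j \in \{0, 1\} & & \forall j \in V
\end{aligned}$$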

By (24), if the sample size is $|R| = \Theta(n^{1-\delta/3} \ln n/(\varepsilon_1^2 \varepsilon_2))$, with probability at least $1 - 2/n^2$, the densest subgraph $x^*$ is a feasible solution to (IP') with the estimations $\rho_j$ obtained by restricting $x^*$ to the vertices in $R$. In the following, we let (LP') denote the Linear Programming relaxation of (IP'), where each $y_j \in [0,1]$.
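To make the constraints concrete, here is a minimal feasibility check for (LP'), written under the assumption that the neighborhood constraints have the two-sided form with slack $\varepsilon_2 n^{\delta/3}$ and that the cardinality constraint is $\sum_{j} y_j = k$. The function names and the numerical tolerance `tol` are choices made for this sketch only.

```python
def is_feasible(y, rho, adj, k, eps1, eps2, delta, tol=1e-9):
    """Check a fractional vector y against the (LP') constraints.

    y: list of values in [0, 1]; rho: list of estimations;
    adj: list of neighbor sets; k: target subgraph size.
    """
    n = len(y)
    slack = eps2 * n ** (delta / 3)
    if any(v < -tol or v > 1 + tol for v in y):
        return False
    if abs(sum(y) - k) > tol:
        return False
    for j in range(n):
        s = sum(y[i] for i in adj[j])  # fractional number of chosen neighbors of j
        if s < (1 - eps1) * rho[j] - slack - tol:
            return False
        if s > (1 + eps1) * rho[j] + slack + tol:
            return False
    return True

def objective(y, rho):
    """Linear objective of (IP') and (LP')."""
    return sum(yj * rj for yj, rj in zip(y, rho))
```

With exact estimations, that is $\rho_j = \sum_{i \in N(j)} x^*_i$, the optimal vector $x^*$ passes the check, while badly wrong estimations make it infeasible.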

The Number of Edges in Feasible Solutions. We next show that the objective value of any feasible solution $y$ to (LP') is close to $p(y)$. Therefore, assuming that $x^*$ is feasible, any good approximation to (IP') is a good approximation to the densest subgraph.

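Lemma 5.1. Let $\rho_1, \ldots, \rho_n$ be non-negative numbers and $y$ be any feasible solution to (LP'). Then,

$$p(y) \in (1 \pm \varepsilon_1) \sum_{j \in V} y_j \rho_j \pm \varepsilon_2 n^{1+\delta/3} \tag{25}$$

Proof. Using the decomposition of $p(y)$ and the formulation of (LP'), we obtain that:

$$\begin{aligned}
p(y) = \sum_{j \in V} y_j \sum_{i \in N(j)} y_i &\in \sum_{j \in V} y_j \left((1 \pm \varepsilon_1)\rho_j \pm \varepsilon_2 n^{\delta/3}\right) \\
&= (1 \pm \varepsilon_1) \sum_{j \in V} y_j \rho_j \pm \varepsilon_2 n^{\delta/3} \sum_{j \in V} y_j \\
&\in (1 \pm \varepsilon_1) \sum_{j \in V} y_j \rho_j \pm \varepsilon_2 n^{1+\delta/3}
\end{aligned}$$

The first inclusion holds because $y$ is feasible for (LP') and thus, $\sum_{i \in N(j)} y_i \in (1 \pm \varepsilon_1)\rho_j \pm \varepsilon_2 n^{\delta/3}$ for all $j$. The second inclusion holds because $\sum_{j \in V} y_j = k \leq n$. □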

Randomized Rounding of the Fractional Optimum. As a last step, we show how to round the fractional optimum $y^* = (y^*_1, \ldots, y^*_n)$ of (LP') to an integral solution $z = (z_1, \ldots, z_n)$ that almost satisfies the constraints of (IP'). To this end, we use randomized rounding, as for Max-CUT. We obtain that with probability at least $1 - 2/n^8$,

$$k - 2\sqrt{n \ln n} \leq \sum_{j \in V} z_j \leq k + 2\sqrt{n \ln n} \tag{26}$$

Specifically, the inequality above follows from the Chernoff bound in the footnote, with $t = 2\sqrt{n \ln n}$, since $\mathbb{E}[\sum_{j \in V} z_j] = k$. Moreover, applying Lemma 4.3 with $q = 0$, $\beta = 1$, $k = 7$, $\delta/3$ (instead of $\delta$) and $\alpha = \max\{\varepsilon_1, \varepsilon_2/2\}$, and using that $y^*$ is a feasible solution to (LP') and that $\varepsilon_1 \in (0,1)$, we obtain that with probability at least $1 - 2/n^8$, for each vertex $j$,

$$(1 - \varepsilon_1)^2 \rho_j - 2\varepsilon_2 n^{\delta/3} \leq \sum_{i \in N(j)} z_i \leq (1 + \varepsilon_1)^2 \rho_j + 2\varepsilon_2 n^{\delta/3} \tag{27}$$

By the union bound, the integral solution $z$ obtained from $y^*$ by randomized rounding satisfies (26) and (27), for all vertices $j$, with probability at least $1 - 3/n^7$.

By linearity of expectation, $\mathbb{E}[\sum_{j \in V} z_j \rho_j] = \sum_{j \in V} y^*_j \rho_j$. Moreover, since the probability that $z$ does not satisfy either (26) or (27), for some vertex $j$, is at most $3/n^7$, and since the objective value of (IP') is at most $n^2$, the expected value of a rounded solution $z$ that satisfies (26) and (27), for all vertices $j$, is at least $\sum_{j \in V} y^*_j \rho_j - 1$ (assuming that $n \geq 2$). As in Max-CUT, such an integral solution $z$ can be found in (deterministic) polynomial time using the method of conditional expectations (see [27]).
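The rounding step and the arithmetic behind the $2/n^8$ failure probability can be sketched as follows. The sketch assumes that the Chernoff bound referred to in the text is the standard additive form $\Pr[|X - \mathbb{E}X| > t] \leq 2\exp(-2t^2/n)$ for a sum $X$ of $n$ independent 0/1 variables; the function names are hypothetical, and the code performs plain independent rounding rather than the derandomized version via conditional expectations.

```python
import math
import random

def randomized_rounding(y, rng):
    """Set z_j = 1 independently with probability y_j."""
    return [1 if rng.random() < yj else 0 for yj in y]

def chernoff_tail_bound(n, t):
    """Additive two-sided bound 2*exp(-2 t^2 / n) for a sum of n
    independent 0/1 random variables deviating by more than t."""
    return 2 * math.exp(-2 * t * t / n)

def deviation_threshold(n):
    """The choice t = 2*sqrt(n ln n) used for inequality (26)."""
    return 2 * math.sqrt(n * math.log(n))
```

Plugging $t = 2\sqrt{n \ln n}$ into the bound gives $2\exp(-8 \ln n) = 2/n^8$, which is the probability stated for (26).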

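The following is similar to Lemma 5.1 and shows that the objective value $p(z)$ of the rounded solution $z$ is close to the optimal value of (LP').

Lemma 5.2. Let $y^*$ be the optimal solution of (LP') and let $z$ be the integral solution obtained from $y^*$ by randomized rounding (and the method of conditional expectations). Then,

$$p(z) \in (1 \pm \varepsilon_1)^2 \sum_{j \in V} y^*_j \rho_j \pm 3\varepsilon_2 n^{1+\delta/3} \tag{28}$$

Proof. Using the decomposition of $p(y)$ and an argument similar to that in the proof of Lemma 5.1, we obtain that:

$$\begin{aligned}
p(z) = \sum_{j \in V} z_j \sum_{i \in N(j)} z_i &\in \sum_{j \in V} z_j \left((1 \pm \varepsilon_1)^2 \rho_j \pm 2\varepsilon_2 n^{\delta/3}\right) \\
&= (1 \pm \varepsilon_1)^2 \sum_{j \in V} z_j \rho_j \pm 2\varepsilon_2 n^{\delta/3} \sum_{j \in V} z_j \\
&\in (1 \pm \varepsilon_1)^2 \sum_{j \in V} z_j \rho_j \pm 2\varepsilon_2 n^{1+\delta/3} \\
&\in (1 \pm \varepsilon_1)^2 \sum_{j \in V} y^*_j \rho_j \pm 3\varepsilon_2 n^{1+\delta/3}
\end{aligned}$$

The first inclusion holds because $z$ satisfies (27) for all $j \in V$. For the second inclusion, we use that $\sum_{j \in V} z_j \leq n$. For the third inclusion, we recall that $\sum_{j \in V} z_j \rho_j \geq \sum_{j \in V} y^*_j \rho_j - 1$ and assume that $n$ is sufficiently large. □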

Putting Everything Together. Therefore, for $\varepsilon > 0$, if $G$ is $\delta$-almost sparse and $k = \Omega(n^{1-\delta/3})$, the algorithm described computes estimations $\rho_j$ such that the densest subgraph $x^*$ is a feasible solution to (IP') with high probability. Hence, by the analysis above, the algorithm computes a slightly infeasible solution approximating the number of edges in the densest subgraph with $k$ vertices within a multiplicative factor of $(1 - \varepsilon_1)^2$ and an additive error of $O(\varepsilon_2 n^{1+\delta/3})$. Setting $\varepsilon_1 = \varepsilon_2 = \varepsilon/8$, the number of edges in the subgraph induced by $z$ satisfies the following with probability at least $1 - 2/n^2$:

$$p(z) \geq (1 - \varepsilon_1)^2 \sum_{j \in V} y^*_j \rho_j - 3\varepsilon_2 n^{1+\delta/3} \geq (1 - \varepsilon_1)^2 \sum_{j \in V} x^*_j \rho_j - 3\varepsilon_2 n^{1+\delta/3} \geq p(x^*) - \varepsilon n^{1+\delta/3} \geq (1 - \varepsilon)p(x^*)$$

The first inequality follows from Lemma 5.2, the second inequality holds because $y^*$ is the optimal solution to (LP') and $x^*$ is feasible for (LP'), the third inequality follows from Lemma 5.1, and the fourth inequality holds because the densest subgraph has at least $\Omega(n^{1+\delta/3})$ edges.

This solution is infeasible by at most $2\sqrt{n \ln n} = o(k)$ vertices and can become feasible by adding or removing at most so many vertices and edges.

Theorem 5.1. Let $G(V, E)$ be a $\delta$-almost sparse graph with $n$ vertices. Then, for any integer $k \geq 1$ and for any $\varepsilon > 0$, we can compute, in time $2^{O(n^{1-\delta/3} \ln n/\varepsilon^3)}$ and with probability at least $1 - 2/n^2$, an induced subgraph $z$ of $G$ with $k$ vertices whose number of edges satisfies $p(z) \geq (1 - \varepsilon)p(x^*)$, where $p(x^*)$ is the number of edges in the $k$-Densest Subgraph of $G$.

6 Lower Bounds 

In this section we give some lower bound arguments which show that the algorithmic schemes we have presented are, in some senses, likely to be almost optimal. Our working complexity assumption will be the Exponential Time Hypothesis (ETH), which states that there is no algorithm that can solve an instance of 3-SAT of size $n$ in time $2^{o(n)}$.

Our starting point is the following inapproximability result, which can be obtained using known PCP constructions and standard reductions.

Theorem 6.1. There exist constants $c, s \in [0,1]$ with $c > s$ such that for all $\varepsilon > 0$ we have the following: if there exists an algorithm which, given an $n$-vertex 5-regular instance of Max-CUT, can distinguish between the case where a solution cuts at least a $c$ fraction of the edges and the case where all solutions cut at most an $s$ fraction of the edges in time $2^{n^{1-\varepsilon}}$, then the ETH fails.

Proof. This inapproximability result follows from the construction of quasi-linear size PCPs given, for example, in [18]. In particular, we use as starting point a result explicitly formulated in [25] as follows: “Solving 3-SAT on inputs of size $N$ can be reduced to distinguishing between the case that a 3CNF formula of size $N^{1+o(1)}$ is satisfiable and the case that only $\frac{7}{8} + o(1)$ fraction of its clauses are satisfiable”.

Take an arbitrary 3-SAT instance of size $N$, which according to the ETH cannot be solved in time $2^{o(N)}$. By applying the aforementioned PCP construction we obtain a 3CNF formula of size $N^{1+o(1)}$ which is either satisfiable or far from satisfiable. Using standard constructions ([26, 6]) we can reduce this formula to a 5-regular graph $G(V, E)$ which will be a Max-CUT instance (we use degree 5 here for concreteness, any reasonable constant would do). We have that $|V|$ is only a constant factor apart from the size of the 3CNF formula. At the same time, there exist constants $c, s$ such that, if the formula was satisfiable, $G$ has a cut of $c|E|$ edges, while if the formula was far from satisfiable, $G$ has no cut with more than $s|E|$ edges. If there exists an algorithm that can distinguish between these two cases in time $2^{|V|^{1-\varepsilon}}$, the whole procedure would run in $2^{o(N)}$ and would allow us to decide if the original formula was satisfiable. □

There are two natural ways in which one may hope to improve or extend the algorithms we have presented so far: relaxing the density requirement or decreasing the running time. We prove in what follows that neither of them can improve the results presented so far.

6.1 Arity Higher Than Two 

First, recall that the algorithm we have given for Max-$k$-CSP works in the density range between $n^{k-1}$ and $n^k$. Here, we give a reduction establishing that it is unlikely that this can be improved.

Theorem 6.2. There exists $r > 1$ such that for all $\varepsilon > 0$ and all (fixed) integers $k \geq 3$ we have the following: if there exists an algorithm which $r$-approximates Max-$k$-SAT on instances with $\Omega(n^{k-1})$ clauses in time $2^{n^{1-\varepsilon}}$, then the ETH fails.

Proof. Consider the Max-CUT instance of Theorem 6.1 and transform it into a 2-SAT instance in the standard way: the set of variables is the set of vertices of the graph and for each edge $(u, v)$ we include the two clauses $(u \lor v)$ and $(\lnot u \lor \lnot v)$, so that a cut edge satisfies both of its clauses and an uncut edge satisfies exactly one. This is an instance of 2-SAT with $n$ variables and $5n$ clauses and there exist constants $c, s$ such that either there exists an assignment satisfying a $c$ fraction of the clauses or all assignments satisfy at most an $s$ fraction of the clauses.

Fix a constant $k$ and introduce to the instance $(k-2)n$ new variables $x_{(i,j)}$, $i \in \{1, \ldots, k-2\}$, $j \in \{1, \ldots, n\}$. We perform the following transformation to the 2-SAT instance: for each clause $(l_1 \lor l_2)$ and for each tuple $(i_1, i_2, \ldots, i_{k-2}) \in \{1, \ldots, n\}^{k-2}$ we construct $2^{k-2}$ new clauses of size $k$. The first two literals of these clauses are always $l_1, l_2$. The remaining $k-2$ literals consist of the variables $x_{(1,i_1)}, x_{(2,i_2)}, \ldots, x_{(k-2,i_{k-2})}$, where in each clause we pick a different set of variables to be negated. In other words, to construct a clause of the new instance we select a clause of the original instance, one variable from each of the $(k-2)$ groups of $n$ new variables, and a subset of these variables that will be negated. The new instance consists of all the size $k$ clauses constructed in this way, for all possible choices.

First, observe that the new instance has $5 \cdot 2^{k-2} n^{k-1}$ clauses and $(k-1)n$ variables, therefore, for each fixed $k$ it satisfies the density conditions of the theorem. Furthermore, consider any assignment of the original formula. For each tuple, any satisfied clause has now been replaced by $2^{k-2}$ satisfied clauses, while for an unsatisfied clause any assignment to the new variables satisfies exactly $2^{k-2} - 1$ clauses. Thus, for fixed $k$, there exist constants $s', c'$ such that either a $c'$ fraction of the clauses of the new instance is satisfiable or at most an $s'$ fraction is. If there exists an approximation algorithm with ratio better than $c'/s'$ running in time $2^{N^{1-\varepsilon}}$, where $N$ is the number of variables of the new instance, we could use it to decide the original instance in a time bound that would disprove the ETH. □

6.2 Almost Tight Time Bounds 

A second possible avenue for improvement may be to consider potential speedups of our algorithms. Concretely, one may ask whether the (roughly) $2^{\sqrt{n}}$ running time guaranteed by our scheme for Max-CUT on graphs with average degree $\sqrt{n}$ is best possible. We give an almost tight answer to such questions via the following theorem.

Theorem 6.3. There exists $r > 1$ such that for all $\varepsilon > 0$ we have the following: if there exists an algorithm which, for some $\Delta = o(n)$, $r$-approximates Max-CUT on $n$-vertex $\Delta$-regular graphs in time $2^{(n/\Delta)^{1-\varepsilon}}$, then the ETH fails.

Proof (Theorem 6.3). Without loss of generality we prove the theorem for the case when the degree is a multiple of 10.

Consider an instance $G(V, E)$ of Max-CUT as given by Theorem 6.1. Let $n = |V|$ and suppose that the desired degree is $d = 10\Delta$, where $\Delta$ is a function of $n$. We construct a graph $G'$ as follows: for each vertex $u \in V$ we introduce $\Delta$ new vertices $u_1, \ldots, u_\Delta$ as well as $5\Delta$ “consistency” vertices $c^u_1, \ldots, c^u_{5\Delta}$. For every edge $(u, v) \in E$ we add all edges $(u_i, v_j)$ for $i, j \in \{1, \ldots, \Delta\}$. Also, for every $u \in V$ we add all edges $(u_i, c^u_j)$, for $i \in \{1, \ldots, \Delta\}$ and $j \in \{1, \ldots, 5\Delta\}$. This completes the construction.

The graph we have constructed is $10\Delta$-regular and is made up of $6\Delta n$ vertices. Let us examine the size of its optimal cut. Consider an optimal solution and observe that, for a given $u \in V$, all the vertices $c^u_j$ can be assumed to be on the same side of the cut, since they all have the same neighbors. Furthermore, for a given $u \in V$, all vertices $u_i$ can be assumed to be on the same side of the cut, namely on the side opposite that of $c^u_j$, since the vertices $c^u_j$ are a majority of the neighborhood of each $u_i$. With this observation it is easy to construct a one-to-one correspondence between cuts in $G$ and locally optimal cuts in $G'$.

Consider now a cut that cuts $c|E|$ edges of $G$. If we set all $u_i$ of $G'$ on the same side as $u$ is placed in $G$ we cut $c|E|\Delta^2$ edges of the form $(u_i, v_j)$. Furthermore, by placing the $c^u_j$ on the opposite side of $u_i$ we cut $5\Delta^2|V|$ edges. Thus the max cut of $G'$ is at least $c|E|\Delta^2 + 5\Delta^2|V|$. Using the previous observations on locally optimal cuts of $G'$ we can conclude that if $G'$ has a cut with $s|E|\Delta^2 + 5\Delta^2|V|$ edges, then $G$ has a cut with $s|E|$ edges. Using the fact that $2|E| = 5|V|$ (since $G$ is 5-regular) we get a constant ratio between the size of the cut of $G'$ in the two cases. Call that ratio $r$.

Suppose now that we have an approximation algorithm with ratio better than $r$ which, given an $N$-vertex $d$-regular graph, runs in time $2^{(N/d)^{1-\varepsilon}}$. Giving our constructed instance as input to this algorithm would allow us to decide the original instance in time $2^{n^{1-\varepsilon}}$. □
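The blow-up used in this proof can be written out as a short sketch, under the assumption that each vertex $u$ is replaced by $\Delta$ copies plus $5\Delta$ consistency vertices attached to those copies. The vertex labels and function names are hypothetical. The code builds $G'$, lifts a cut of $G$ by placing all copies of $u$ on $u$'s side and its consistency vertices on the opposite side, and counts the edges cut.

```python
def blow_up(n, edges, Delta):
    """Return (vertices, edges) of G'.

    Copies are labelled ('u', u, i) for i in range(Delta) and
    consistency vertices ('c', u, j) for j in range(5 * Delta).
    """
    new_edges = []
    for u, v in edges:
        for i in range(Delta):
            for j in range(Delta):
                new_edges.append((('u', u, i), ('u', v, j)))
    for u in range(n):
        for i in range(Delta):
            for j in range(5 * Delta):
                new_edges.append((('u', u, i), ('c', u, j)))
    vertices = [('u', u, i) for u in range(n) for i in range(Delta)]
    vertices += [('c', u, j) for u in range(n) for j in range(5 * Delta)]
    return vertices, new_edges

def lift_cut(side, vertices):
    """side[u] in {0, 1}; copies follow u, consistency vertices go opposite."""
    return {v: side[v[1]] if v[0] == 'u' else 1 - side[v[1]] for v in vertices}

def cut_size(edges, side):
    return sum(1 for a, b in edges if side[a] != side[b])
```

On the complete graph $K_6$ (which is 5-regular) with $\Delta = 2$, the construction has $6\Delta n = 72$ vertices, every copy $u_i$ has degree $10\Delta = 20$, and a cut of $G$ with 9 edges lifts to a cut with $9\Delta^2 + 5\Delta^2 |V| = 156$ edges, in line with the count in the proof.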


Theorem 6.3 establishes that our approach is essentially optimal, not just for average degree $\sqrt{n}$, but for any other intermediate density.
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